* GNAT Modification_Time limitation @ 2018-11-19 22:56 Lionel Draghi 2018-11-20 0:47 ` Shark8 ` (2 more replies) 0 siblings, 3 replies; 22+ messages in thread From: Lionel Draghi @ 2018-11-19 22:56 UTC (permalink / raw)

I am coding a kind of make application that depends on files' time tags (thanks to Ada.Directories.Modification_Time) and on Ada.Calendar.Clock, both returning Ada.Calendar.Time.

Unfortunately, I came across a GNAT limitation in the Modification_Time implementation on Linux: sub-seconds are ignored, and Modification_Time returns
> Time_Of (Year, Month, Day, Hour, Minute, Second, 0.0);

So, at the same moment, Clock returns 2018-10-29 20:36:01.47 while Modification_Time returns 2018-10-29 20:36:01.00

This prevents me from knowing whether a file was modified before or after a certain time, and thus undermines my efforts.

My workaround was to impair Clock's precision as well, with an ugly rounding:
> Time := Ada.Calendar.Clock;
> New_Time := Time_Of
>   (Year    => Year (Time),
>    Month   => Month (Time),
>    Day     => Day (Time),
>    Seconds => Day_Duration (Float'Floor (Float (Seconds (Time)))));

But that's not a correct solution either: I have to order lots of file creations, and having all files created during the same second return the same time tag also prevents my algorithm from working properly.

Any workaround to get a precise file time tag?
Or to compare a file's time tag with Clock?

Thanks,
-- Lionel

^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: GNAT Modification_Time limitation 2018-11-19 22:56 GNAT Modification_Time limitation Lionel Draghi @ 2018-11-20 0:47 ` Shark8 2018-11-20 1:33 ` Keith Thompson 2018-11-20 1:33 ` Keith Thompson 2018-11-20 8:08 ` briot.emmanuel 2 siblings, 1 reply; 22+ messages in thread From: Shark8 @ 2018-11-20 0:47 UTC (permalink / raw) The problem with using the filesystem timestamp is that its resolution is too coarse compared to the processing-speed of your CPU. I would recommend either implementing some sort of controlled cache or version control, or 'hacking' the timestamp so that it's really a build number (e.g. build 1 -> 01 Jan 1900, build 2 -> 02 Jan 1900, build 35 -> 04 Feb 1900, etc.). ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: GNAT Modification_Time limitation 2018-11-20 0:47 ` Shark8 @ 2018-11-20 1:33 ` Keith Thompson 0 siblings, 0 replies; 22+ messages in thread From: Keith Thompson @ 2018-11-20 1:33 UTC (permalink / raw) Shark8 <onewingedshark@gmail.com> writes: > The problem with using the filesystem timestamp is that its resolution > is too coarse compared to the processing-speed of your CPU. That depends on the filesystem. See my other followup in this thread. -- Keith Thompson (The_Other_Keith) kst@mib.org <http://www.ghoti.net/~kst> Will write code for food. "We must do something. This is something. Therefore, we must do this." -- Antony Jay and Jonathan Lynn, "Yes Minister" ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: GNAT Modification_Time limitation 2018-11-19 22:56 GNAT Modification_Time limitation Lionel Draghi 2018-11-20 0:47 ` Shark8 @ 2018-11-20 1:33 ` Keith Thompson 2018-11-20 23:32 ` Randy Brukardt 2018-11-20 8:08 ` briot.emmanuel 2 siblings, 1 reply; 22+ messages in thread From: Keith Thompson @ 2018-11-20 1:33 UTC (permalink / raw) Lionel Draghi <lionel.draghi@gmail.com> writes: > I am coding a kind of make application, that depends on file's time > tag (thanks to Ada.Directories.Modification_Time), and on > Ada.Calendar.Clock, both returning Ada.Calendar.Time. > > Unfortunately, I came across a GNAT limitation in the > Modification_Time implementation on Linux : sub-second are ignored, > and Modification_Time returns >> Time_Of (Year, Month, Day, Hour, Minute, Second, 0.0); > > So, at the same time Clock returns 2018-10-29 20:36:01.47 > while Modification_Time returns 2018-10-29 20:36:01.00 > > This prevents me from knowing if a file is modified before or after > certain time, and thus undermine my efforts. > > My workaround was to impair also Clock precision, with an ugly rounding: >> Time := Ada.Calendar.Clock; >> New_Time := Time_Of >> (Year => Year (Time), >> Month => Month (Time), >> Day => Day (Time), >> Seconds => Day_Duration (Float'Floor (Float (Seconds (Time))))); > > But that's not a correct solution either : I have to order lots of > file creation, and having all files created during the same second > returning the same time tag also prevent my algorithm from properly > working. > > Any workaround to get a precise file time tag? > Or to compare file's time tag with Clock? It's odd that GNAT's Modification_Time truncates the time to one-second precision. A quick experiment on my system (Ubuntu 18.04) also indicates that it does so, even though the system stores the timestamp in nanosecond precision. 
On Linux 2.6 and later, the underlying stat() system call gives you a "struct timespec" value for the modification time, as specified by the current POSIX standard. (struct timespec represents times with nanosecond precision.) A file system isn't required to store times with that precision, but many do. If you're on a POSIX system, you should be able to call the stat() system call and *probably* get a more precise timestamp. If you're on a non-POSIX system, there might still be a system-specific way to get a more precise timestamp. (NTFS also seems to store timestamps with high precision.) (And remember that nanosecond precision doesn't necessarily imply nanosecond accuracy.) -- Keith Thompson (The_Other_Keith) kst@mib.org <http://www.ghoti.net/~kst> Will write code for food. "We must do something. This is something. Therefore, we must do this." -- Antony Jay and Jonathan Lynn, "Yes Minister" ^ permalink raw reply [flat|nested] 22+ messages in thread
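As a rough illustration of the point about stat() and struct timespec, here is a minimal Python sketch (Python is used only for brevity; it is not what the tools in this thread are written in). os.stat wraps the same stat() call and exposes the modification time as an integer number of nanoseconds, st_mtime_ns. The helper names mtime_parts and modified_after are made up for this example, and whether the sub-second part is meaningful still depends on the filesystem.

```python
import os
import tempfile

NS_PER_SECOND = 1_000_000_000


def mtime_parts(path):
    """Return (whole seconds, nanoseconds) of the file's modification time."""
    # st_mtime_ns is an integer, so nothing is lost to float rounding.
    return divmod(os.stat(path).st_mtime_ns, NS_PER_SECOND)


def modified_after(path, reference_ns):
    """True if the file was modified strictly after reference_ns (ns since the epoch)."""
    return os.stat(path).st_mtime_ns > reference_ns


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as tmp:
        seconds, nanos = mtime_parts(tmp.name)
        # A one-second API would report only `seconds`; `nanos` is the part that is dropped.
        print(f"{seconds}.{nanos:09d}")
```

From Ada, the equivalent would presumably be a small C wrapper around stat() imported with pragma Import, but that binding is not shown here.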
* Re: GNAT Modification_Time limitation 2018-11-20 1:33 ` Keith Thompson @ 2018-11-20 23:32 ` Randy Brukardt 2018-11-21 8:23 ` Dmitry A. Kazakov 0 siblings, 1 reply; 22+ messages in thread From: Randy Brukardt @ 2018-11-20 23:32 UTC (permalink / raw) "Keith Thompson" <kst-u@mib.org> wrote in message news:lnefbgr0rz.fsf@kst-u.example.com... ... > If you're on a non-POSIX system, there might still be a > system-specific way to get a more precise timestamp. (NTFS also > seems to store timestamps with high precision.) NTFS has three timestamps (modification, creation, and last access). Only the modification has high precision; the others are only good to full seconds (or something like that). FAT file systems (as you might encounter on a camera or USB stick) only have precision to 2 seconds. (Which is why we had to deal with this in the Janus/Ada build tools fairly early on.) Also note that the system clock on Windows systems typically only changes every 0.01 sec (Dmitry says this can be changed, although I've never seen that done). That extends to the file systems and other OS timers as well. Most Ada vendors use an Ada.Calendar.Clock that blends the system clock with the high performance timer to get useful accuracy out of Ada.Calendar.Time. (A customer/collaborator, Tom Moran, originally wrote that code for the Janus/Ada implementation of Calendar to fix some timing problem that he had. He eventually submitted similar code to AdaCore, who added it to their Calendar as well.) Moral: Doing "Make" on a modern machine, especially if you want it to be portable, is a tricky job. Randy. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: GNAT Modification_Time limitation 2018-11-20 23:32 ` Randy Brukardt @ 2018-11-21 8:23 ` Dmitry A. Kazakov 0 siblings, 0 replies; 22+ messages in thread From: Dmitry A. Kazakov @ 2018-11-21 8:23 UTC (permalink / raw) On 2018-11-21 00:32, Randy Brukardt wrote: > Also note that the system clock on Windows systems typically only changes > every 0.01 sec (Dmitry says this can be changed, although I've never seen > that done). The API call is timeBeginPeriod https://docs.microsoft.com/en-us/windows/desktop/api/timeapi/nf-timeapi-timebeginperiod The time resolution could be set down to 1ms (and never call timeEndPeriod as the page suggests (:-)) > Moral: Doing "Make" on a modern machine, especially if you want it to be > portable, is a tricky job. Yes, especially because the OS on the modern machine tends to deploy worst possible time source available. I guess that some MS-DOS code still does that job on your i9 ... -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: GNAT Modification_Time limitation 2018-11-19 22:56 GNAT Modification_Time limitation Lionel Draghi 2018-11-20 0:47 ` Shark8 2018-11-20 1:33 ` Keith Thompson @ 2018-11-20 8:08 ` briot.emmanuel 2018-11-20 11:57 ` Lionel Draghi 2018-11-20 23:53 ` Randy Brukardt 2 siblings, 2 replies; 22+ messages in thread From: briot.emmanuel @ 2018-11-20 8:08 UTC (permalink / raw)

> I am coding a kind of make application, that depends on file's time tag (thanks to Ada.Directories.Modification_Time), and on Ada.Calendar.Clock, both returning Ada.Calendar.Time.

Interesting. I am in the middle of a discussion with AdaCore about gprbuild, which fails to recompile when using an alternative body that happens to have the same time stamp (to the second). gprbuild sees that the modification time appears to be the same, and thus doesn't recompile.

Two points:
- AdaCore mentioned they made progress recently on timestamp precision and it would likely fix the scenario. I think this is similar to what you reported, so it is likely your issue has been fixed now.
- I am arguing with AdaCore that checking timestamps is not enough (might not even be useful at all), as Shark8 mentioned.

The scenario I have is the following: Create a project with one scenario variable. Depending on that variable, choose src1 or src2 for source dirs. In each of these directories, have a file utils.adb with different content. "touch" these two files so that they have the same timestamp. If you build your application once with one value of the variable, then rebuild with another value, gprbuild does nothing the second time.

I had a similar real case because git created two files with the same timestamp. And then it took me days to understand why some of my tests appeared to be linked with both versions of utils.adb, since I could see in the log file traces from both src1/utils.adb and src2/utils.adb. Very very confusing.
So I would indeed recommend that you don't bother with timestamps, and only look at file contents (or use timestamp+file path at the very least, or perhaps inodes). I am interested in hearing more why you want to code a new 'make-like' ? Now trying to persuade AdaCore that gprbuild's behavior is incorrect... Emmanuel ^ permalink raw reply [flat|nested] 22+ messages in thread
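To make the "look at file contents" recommendation concrete, here is a small Python sketch. It is only an illustration of the idea, not what gprbuild or any tool named in this thread does; file_digest and needs_rebuild are hypothetical helper names, and SHA-256 is just one reasonable choice of hash. The demo recreates the src1/src2 situation described earlier: two files with identical timestamps but different contents.

```python
import hashlib
import os
import tempfile


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of the file's contents."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_rebuild(source_path, recorded_digest):
    """True if the source contents differ from what the last build recorded."""
    return file_digest(source_path) != recorded_digest


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        first = os.path.join(root, "src1_utils.adb")
        second = os.path.join(root, "src2_utils.adb")
        for path, text in ((first, "-- real body\n"), (second, "-- mock body\n")):
            with open(path, "w") as handle:
                handle.write(text)
            # Force identical timestamps, as in the "touch" scenario.
            os.utime(path, ns=(0, 0))
        same_time = os.stat(first).st_mtime_ns == os.stat(second).st_mtime_ns
        print("same timestamp:", same_time)
        print("rebuild needed:", needs_rebuild(second, file_digest(first)))
```

A digest comparison costs a full read of every source on each build, which is the usual trade-off against a cheap stat() call; some tools check the timestamp first and hash only when it is inconclusive.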
* Re: GNAT Modification_Time limitation 2018-11-20 8:08 ` briot.emmanuel @ 2018-11-20 11:57 ` Lionel Draghi 2018-11-21 7:40 ` briot.emmanuel 2018-11-20 23:53 ` Randy Brukardt 1 sibling, 1 reply; 22+ messages in thread From: Lionel Draghi @ 2018-11-20 11:57 UTC (permalink / raw)

Thank you guys for your answers:

@Shark8: see the description of my app below; I will try the simple way first :-)

@Keith and Emmanuel: the Time_Of call I put in my message comes from the body of Ada.Directories (/opt/GNAT/2018/lib/gcc/x86_64-pc-linux-gnu/7.3.1/adainclude/a-direct.adb):

    ...
    Date := File_Time_Stamp (Name);
    GM_Split (Date, Year, Month, Day, Hour, Minute, Second);
    return Time_Of (Year, Month, Day, Hour, Minute, Second, 0.0);
    ...

and GM_Split (in the System.OS_Lib package) is calling

    procedure To_GM_Time
      (P_Time_T : Address;
       P_Year   : Address;
       P_Month  : Address;
       P_Day    : Address;
       P_Hours  : Address;
       P_Mins   : Address;
       P_Secs   : Address);
    pragma Import (C, To_GM_Time, "__gnat_to_gm_time");

P_Secs points to an Integer. So the limitation seems to come from GNAT's C interface to the OS library.

@Keith: my app is (in this first version) using strace, so thanks for the stat idea; I should be able to get the OS time stamp directly from the strace output.

@Emmanuel: my make is a POC to do a make without a makefile! :-)
It runs commands and observes file accesses (thanks to the Linux kernel ptrace interface), and automatically understands which files it depends on and which files are outputs.

My first test case is to replace this Makefile:

    all: hello

    hello.o: hello.c
        gcc -o hello.o -c hello.c

    main.o: main.c hello.h
        gcc -o main.o -c main.c

    hello: hello.o main.o
        gcc -o hello hello.o main.o

with just:

    gcc -o hello.o -c hello.c
    gcc -o main.o -c main.c
    gcc -o hello hello.o main.o

and to get the same optimized behavior when removing a .o file or touching one of the source files.

-- Lionel

^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: GNAT Modification_Time limitation 2018-11-20 11:57 ` Lionel Draghi @ 2018-11-21 7:40 ` briot.emmanuel 2018-11-21 11:16 ` briot.emmanuel 2018-11-21 19:02 ` Lionel Draghi 0 siblings, 2 replies; 22+ messages in thread From: briot.emmanuel @ 2018-11-21 7:40 UTC (permalink / raw)

> @Emmanuel : my make is a POC to do a make without makefile! :-)
> it runs command and observes files accesses (thanks to linux kernel ptrace interface), and automatically understand what files it depends on, and what files are output.

There was an article earlier this week on reddit about `redo`, which seems to have a similar idea of top-down compilation: you have a linker script that tells redo it needs a.o, b.o and c.o (then redo recursively processes those), and finally does the link. In turn, for a.o you would tell redo it needs a.ads, a.adb and b.ads, and then compile, and so on.

With your idea of using ptrace, that might be an automatic way to tell redo about the dependency graph.

I am not sure redo would be really usable on actual projects though. You have to list the dependencies for the linker, for instance (I much prefer the gprbuild approach of finding those automatically).

A similar limitation seems to exist in your POC: how do I, as a novice user, know what to compile in the first place? It seems you would need a combination of what gprbuild does with ptrace:

- compile (with ptrace) the main unit.
- gprbuild then uses the ALI file to find the dependencies, and checks those recursively.
- in your case, you would instead look at the ptrace output to find those dependencies.

The ptrace approach would be much more reliable (though Linux-specific), since you would know for instance:

- that the compiler searched for and did not find foo.ads in /first/dir
- that it found and opened /other/dir/foo.ads

so next time there is a build you can check first whether 'foo.ads' now exists in /first/dir. If that file now exists, you need to rebuild.
gprbuild doesn't handle such changes on the system; it only stores what it found. (This is all an interesting concept I learned this week from `redo`.) Let us know the result of the experiment! Emmanuel ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: GNAT Modification_Time limitation 2018-11-21 7:40 ` briot.emmanuel @ 2018-11-21 11:16 ` briot.emmanuel 2018-11-21 19:13 ` Lionel Draghi 2018-11-21 19:02 ` Lionel Draghi 1 sibling, 1 reply; 22+ messages in thread From: briot.emmanuel @ 2018-11-21 11:16 UTC (permalink / raw)

> The ptrace approach would be much more reliable (though linux-specific), since you would know
> for instance:
>
> - that the compiler searched and did not find foo.ads in /first/dir
> - found and opened /other/dir/foo.ads
>
> so next time there is a build you can check first whether 'foo.ads' now exists in /first/dir. If that file
> now exists, you need to rebuild.
> gprbuild doesn't handle such changes on the system, it only stores what it found.

Slightly off topic (sorry): I found tup (http://gittup.org/tup/index.html), which appears to be doing exactly what you want to achieve. It monitors file accesses, but it uses a fuse filesystem for this rather than ptrace.

I had implemented a fuse filesystem in Ada at some point, though I do not have that code anymore. AdaCore was using that to access a database that contains all build+tests results on all possible combinations, if I remember right.

^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: GNAT Modification_Time limitation 2018-11-21 11:16 ` briot.emmanuel @ 2018-11-21 19:13 ` Lionel Draghi 0 siblings, 0 replies; 22+ messages in thread From: Lionel Draghi @ 2018-11-21 19:13 UTC (permalink / raw)

On Wednesday, 21 November 2018 at 12:16:12 UTC+1, briot.e...@gmail.com wrote:
...
> Slightly out of topic (sorry): I found tup (http://gittup.org/tup/index.html) which appears to be doing
> exactly what you want to achieve. It monitors file accesses but it uses a fuse filesystem for this, rather
> than ptrace.
>

Very interesting information, for me at least :-), thank you.

Not sure the goal is the same. I see on http://gittup.org/tup/ex_a_first_tupfile.html a small example of a tupfile, and it gives both the input and the target along with the command:

    : hello.c |> gcc hello.c -o hello |> hello

This is what I am trying to avoid! (Not to mention one more specific format.)

^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: GNAT Modification_Time limitation 2018-11-21 7:40 ` briot.emmanuel 2018-11-21 11:16 ` briot.emmanuel @ 2018-11-21 19:02 ` Lionel Draghi 2018-11-21 19:48 ` Simon Wright 1 sibling, 1 reply; 22+ messages in thread From: Lionel Draghi @ 2018-11-21 19:02 UTC (permalink / raw)

On Wednesday, 21 November 2018 at 08:40:08 UTC+1, briot.e...@gmail.com wrote:
...
> With your idea of using ptrace, that would be an automatic way maybe to tell
> redo about the dependency graph.

Exactly, the idea of the POC is to see how far we can go without any explicit description of the dependency graph, or any build recipes.

...
> A similar limitation seems to exist in your POC: how do I, as a novice user,
> know what to compile in the first place ?

It's not in my scope: I am not aiming to make compilation easier (I don't claim to do a better job than gprbuild or the like), just to run a list of commands smartly. I used a C compilation example as it's a classic make example, but it could be any sequence of commands:

    latex <file>.tex
    dvips <file>.dvi
    ps2pdf <file>.ps
    pdf2eps <pagenumber> <file>

And gprbuild, or even a complex make, could be one of those commands.

...
> The ptrace approach would be much more reliable (though linux-specific), since you would know
> for instance:
>
> - that the compiler searched and did not find foo.ads in /first/dir
> - found and opened /other/dir/foo.ads
>
> so next time there is a build you can check first whether 'foo.ads' now exists in /first/dir. If that file
> now exists, you need to rebuild.

Exactly my intent.

And to build the dependency graph, I need to identify which file is an input file, and which one is an output (a target).

To do so, I can either:
1. make a complex analysis of a detailed strace log file on each file operation;
2. just ask strace for the list of the involved files, and classify those files by modification time: if file modification time > execution time, then it's an output.
The second option seems to be far less complex, but I need enough precision in time stamps to tell whether a file is older than the command run time or not. Note also that I could store a hash for each used file to check whether the file is the same without getting into all those time tag problems (I am pretty sure most OSes offer such services). It would certainly be useful and reliable for deciding whether to re-execute a command, but it wouldn't help to classify whether the used file was only read or was an output. So I didn't investigate in that direction. -- Lionel ^ permalink raw reply [flat|nested] 22+ messages in thread
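The second option can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the POC's actual code: classify_by_mtime is a made-up name, and started_ns is assumed to be a nanosecond timestamp taken just before the command ran. One caveat to keep in mind: the clock the application reads and the clock the filesystem stamps files with may not tick together, so taking started_ns from the mtime of a marker file touched just before the command may be safer than reading the system clock. Ties (equal timestamps) are classified as inputs here, following the strict ">" in the message above.

```python
import os
import tempfile


def classify_by_mtime(paths, started_ns):
    """Split paths into (inputs, outputs) using modification time alone.

    A file modified strictly after started_ns is assumed to be an output of
    the command; everything else is assumed to be an input.
    """
    inputs, outputs = [], []
    for path in paths:
        if os.stat(path).st_mtime_ns > started_ns:
            outputs.append(path)
        else:
            inputs.append(path)
    return inputs, outputs


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        source = os.path.join(root, "hello.c")
        target = os.path.join(root, "hello.o")
        for path in (source, target):
            open(path, "w").close()
        base = 1_540_845_360_000_000_000  # arbitrary reference instant, in ns
        os.utime(source, ns=(base, base))  # written before the command
        os.utime(target, ns=(base + 2_000_000_000, base + 2_000_000_000))  # written after
        print(classify_by_mtime([source, target], base + 1_000_000_000))
```

The demo sets timestamps explicitly so the result does not depend on how fast the machine is; in real use, two files written within the same clock tick remain indistinguishable, which is the limitation this thread is about.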
* Re: GNAT Modification_Time limitation 2018-11-21 19:02 ` Lionel Draghi @ 2018-11-21 19:48 ` Simon Wright 2018-11-21 22:14 ` Lionel Draghi 0 siblings, 1 reply; 22+ messages in thread From: Simon Wright @ 2018-11-21 19:48 UTC (permalink / raw) Lionel Draghi <lionel.draghi@gmail.com> writes: > And to build the dependency graph, I need to identify which file is an > input file, and which one is an output (a target). > > To do so, I can either: > 1. make a complex analysis of a detailed strace log file on each file > operation; > 2. just ask strace the list of the involved files, and classify those > file thanks to modification time : if file modification time > > execution time, then it's an output. Can't you tell from strace which files were opened for read and which for write? I suppose there are some files that are opened read/write; either, perhaps most usually, in separate parts of the build, or by being updated in one. I have one project (tcladashell) which runs a tcl script to generate a C source, which is compiled, built, and run to generate an Ada package spec. Which is then used in the rest of the build. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: GNAT Modification_Time limitation 2018-11-21 19:48 ` Simon Wright @ 2018-11-21 22:14 ` Lionel Draghi 0 siblings, 0 replies; 22+ messages in thread From: Lionel Draghi @ 2018-11-21 22:14 UTC (permalink / raw) On Wednesday, 21 November 2018 at 20:48:40 UTC+1, Simon Wright wrote: ... > Can't you tell from strace which files were opened for read and which > for write? Yes, strace can monitor every call to the system API, with parameters. I thought the time tag way was easier, but it may be time to change my mind :-) > I suppose there are some files that are opened read/write; either, > perhaps most usually, in separate parts of the build, or by being > updated in one. strace -f also monitors sub-processes (-f stands for "follow forks"). That's handy. ^ permalink raw reply [flat|nested] 22+ messages in thread
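The "read the open flags from strace" idea can be sketched as follows, in Python. This is a hedged illustration, not the POC's implementation: the lines in SAMPLE_TRACE are hand-written to resemble what a command such as `strace -f -e trace=open,openat -o trace.log gcc ...` typically logs, they are not a captured trace, and classify_opens and OPEN_RE are hypothetical names. Real traces also contain "unfinished"/"resumed" fragments and other relevant calls (creat, rename, unlink) that this sketch ignores.

```python
import re

# Matches lines such as:
#   1234 openat(AT_FDCWD, "main.c", O_RDONLY|O_NOCTTY) = 3
#   1234 openat(AT_FDCWD, "x.h", O_RDONLY) = -1 ENOENT (No such file or directory)
OPEN_RE = re.compile(
    r'\b(?:open\(|openat\([^,]+, )'
    r'"(?P<path>(?:[^"\\]|\\.)*)", '
    r'(?P<flags>[^,)]+)'
    r'(?:, [0-7]+)?\)'
    r'\s*= (?P<ret>-?\d+)(?: (?P<errno>E[A-Z]+))?'
)

# Illustrative input only (assumed format, not real output).
SAMPLE_TRACE = """\
1234 openat(AT_FDCWD, "/first/dir/hello.h", O_RDONLY|O_NOCTTY) = -1 ENOENT (No such file or directory)
1234 openat(AT_FDCWD, "/other/dir/hello.h", O_RDONLY|O_NOCTTY) = 4
1234 openat(AT_FDCWD, "main.c", O_RDONLY|O_NOCTTY) = 3
1235 openat(AT_FDCWD, "main.o", O_RDWR|O_CREAT|O_TRUNC, 0666) = 3
"""


def classify_opens(lines):
    """Return (reads, writes, missing) sets of paths seen in open/openat calls."""
    reads, writes, missing = set(), set(), set()
    for line in lines:
        match = OPEN_RE.search(line)
        if match is None:
            continue
        path = match["path"]
        flags = set(match["flags"].split("|"))
        if int(match["ret"]) < 0:
            # A failed lookup matters too: if the file appears later, rebuild.
            if match["errno"] == "ENOENT":
                missing.add(path)
        elif flags & {"O_WRONLY", "O_RDWR"}:
            writes.add(path)
        else:
            reads.add(path)
    # A path that was ever opened for writing is treated as an output.
    reads -= writes
    return reads, writes, missing


if __name__ == "__main__":
    reads, writes, missing = classify_opens(SAMPLE_TRACE.splitlines())
    print("inputs: ", sorted(reads))
    print("outputs:", sorted(writes))
    print("missing:", sorted(missing))
```

The "missing" set corresponds to the point made earlier in the thread about files the compiler looked for and did not find: recording them lets a later run notice when such a file appears.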
* Re: GNAT Modification_Time limitation 2018-11-20 8:08 ` briot.emmanuel 2018-11-20 11:57 ` Lionel Draghi @ 2018-11-20 23:53 ` Randy Brukardt 2018-11-21 7:31 ` briot.emmanuel 1 sibling, 1 reply; 22+ messages in thread From: Randy Brukardt @ 2018-11-20 23:53 UTC (permalink / raw) <briot.emmanuel@gmail.com> wrote in message news:04221674-95d8-4d4a-8743-42877b13eead@googlegroups.com... ... >I had a similar real case because git created two files with the same >timestamp. And then it took me days to understand why some of >my tests appeared to be linked with both versions of utils.adb, since >I could see in the log file traces from both src1/utils.adb and >src2/utils.adb. Very very confusing. > >So I would indeed recommend that you don't bother with timestamps, >and only look at file contents (or use timestamp+file path at the very >least, or perhaps inodes). I wouldn't claim that the situation is that dire; it seems to be related to the particular implementation of a particular GNAT feature (project scenario variables). If you're not implementing something where the source code location can be changed for a particular build, then timestamps will work (but you have to remember that they are quite granular). It also seems to be related in part of source-based compilation (which necessarily keeps less information between builds). In a Janus/Ada project (which is very different than a GNAT project -- it's a binary DB-like file of compilation information), changing the location of a source file would invalidate the entire entry and essentially delete any existing compilations. More likely, however, is that a scenario would be set up using separate project files (most likely using Windows batch files/Unix shell-scripts to automate), so each would have their own set of compilation states. 
And it's completely impossible to bind multiple versions of a unit into a single executable; only one or the other could be selected - and if somehow some files were compiled against the wrong one, some or all of the compilation timestamps wouldn't match (which would cause binding failure). The moral here is how to implement a Make-like tool depends a lot on what capabilities it will have. Randy. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: GNAT Modification_Time limitation 2018-11-20 23:53 ` Randy Brukardt @ 2018-11-21 7:31 ` briot.emmanuel 2018-11-21 14:38 ` Shark8 ` (2 more replies) 0 siblings, 3 replies; 22+ messages in thread From: briot.emmanuel @ 2018-11-21 7:31 UTC (permalink / raw) > I wouldn't claim that the situation is that dire; it seems to be related to > the particular implementation of a particular GNAT feature (project scenario > variables). If you're not implementing something where the source code > location can be changed for a particular build, then timestamps will work > (but you have to remember that they are quite granular). The trick of course is to define what a "build" is in your sentence. If it is one execution of the builder (gprbuild, make,...) then I think it is indeed a reasonable assertion. If however a build is defined to something that amount to "in debug mode, in production mode,..." then of course it might happen that the sources are changed and the timestamp have a timestamp delta of less than 1s (when we generate code for instance). Furthermore, the actual scenario was the following: in the automatic tests, I need to simulate the connection to the database, so that means I need to have support for alternate bodies (but I still compile in debug mode, or production mode,...). Is that still the same "build" ? I would guess it is, but in the end we would end up with literally dozens of "build" types, each with its own set of object files, and each taking 20 or 30 minutes to build from scratch. Not realistic for continuous testing. I spent some time looking around at general builder tools around. Most of them seem to advertise nowadays that they look at file contents, not timestamps. I started from the list at https://en.wikipedia.org/wiki/List_of_build_automation_software, and looked at a few of them. > It also seems to be related in part of source-based compilation (which > necessarily keeps less information between builds). 
In a Janus/Ada project > (which is very different than a GNAT project -- it's a binary DB-like file > of compilation information), changing the location of a source file would > invalidate the entire entry and essentially delete any existing > compilations. More likely, however, is that a scenario would be set up using > separate project files (most likely using Windows batch files/Unix > shell-scripts to automate), so each would have their own set of compilation > states. That's more or less what gprbuild does in practice. It uses a "distributed database" via the .ALI files, which are found in the object directories, so for best use each "build" should have a different object directories. And we are again hitting the notion of "build". > And it's completely impossible to bind multiple versions of a unit > into a single executable; only one or the other could be selected - That's indeed one of the ways gprbuild could detect the error. To me it is a bug in gprbuild that it allows linking different files for the same unit into the same executable. > somehow some files were compiled against the wrong one, some or all of the > compilation timestamps wouldn't match (which would cause binding failure). timestamps are not reliable enough, especially on modern fast machines. I am pretty sure you will hit a similar issue I had, one day. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: GNAT Modification_Time limitation 2018-11-21 7:31 ` briot.emmanuel @ 2018-11-21 14:38 ` Shark8 2018-11-21 17:32 ` Simon Wright 2018-11-21 23:34 ` Randy Brukardt 2 siblings, 0 replies; 22+ messages in thread From: Shark8 @ 2018-11-21 14:38 UTC (permalink / raw) On Wednesday, November 21, 2018 at 12:31:13 AM UTC-7, briot.e...@gmail.com wrote: > > I spent some time looking around at general builder tools around. Most of them seem to > advertise nowadays that they look at file contents, not timestamps. I started from the list > at https://en.wikipedia.org/wiki/List_of_build_automation_software, and looked at a few of them. I read that as https://en.wikipedia.org/wiki/List_of_build_abomination_software and had to do a double take. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: GNAT Modification_Time limitation 2018-11-21 7:31 ` briot.emmanuel 2018-11-21 14:38 ` Shark8 @ 2018-11-21 17:32 ` Simon Wright 2018-11-21 17:43 ` briot.emmanuel 2018-11-21 23:34 ` Randy Brukardt 2 siblings, 1 reply; 22+ messages in thread From: Simon Wright @ 2018-11-21 17:32 UTC (permalink / raw) briot.emmanuel@gmail.com writes: > That's more or less what gprbuild does in practice. It uses a > "distributed database" via the .ALI files, which are found in the > object directories, so for best use each "build" should have a > different object directories. And we are again hitting the notion of > "build". Ideally, each distinct set of scenario variable values should have its own object directory. Will take a lot of time for the initial compilations, of course. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: GNAT Modification_Time limitation 2018-11-21 17:32 ` Simon Wright @ 2018-11-21 17:43 ` briot.emmanuel 0 siblings, 0 replies; 22+ messages in thread From: briot.emmanuel @ 2018-11-21 17:43 UTC (permalink / raw)

> Ideally, each distinct set of scenario variable values should have its
> own object directory. Will take a lot of time for the initial
> compilations, of course.

It's actually more than that. We already use the above (and indeed we have like 5 or 6 major scenarios; thankfully we do not compile quite all the possible combinations).

But in the context of tests, we use extending projects to override some of the sources (for instance so that we do not have to actually have a database running). The test project itself is an extending-all. So if you have the simple case:

    a.gpr imports b.gpr imports c.gpr imports d.gpr

and need to substitute a body for a file c.adb in C, you then extend that project, and make a2.gpr an extending-all project, thus we now have:

    a.gpr imports b.gpr imports c.gpr imports d.gpr
    |
    a2.gpr imports b.gpr imports c2.gpr imports d.gpr

The scenario variables have not changed, so b's objects will go in the 'obj-production' directory as before, for instance. But in fact, some of the object files now depend on that alternate body of c.adb. If you had some inlined subprograms in c.adb (using -gnatn), then part of their code is in b.o.

In the common (and optimistic) case where c.adb has a different timestamp from before, b.o will be recompiled and all is fine. If c.adb has the same timestamp as the original file (because, hey, git does what it wants), gprbuild doesn't notice the change in c.adb, so doesn't recompile b.o, and when we link the executable we get some code from the old c.adb (the inlined code).

This is why just checking the timestamp is not (cannot be) good enough.
Ideally, we should try and use a different object directory here (though the scenario is the same), but I don't know how to do that (b.gpr hasn't changed, thanks to the extend-all project). And if you add to the original 5 scenario variables another case where you can potentially mock any number of projects, you end up with way too many combinations of object directories; my disk would not be big enough, I think. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: GNAT Modification_Time limitation 2018-11-21 7:31 ` briot.emmanuel 2018-11-21 14:38 ` Shark8 2018-11-21 17:32 ` Simon Wright @ 2018-11-21 23:34 ` Randy Brukardt 2018-11-22 8:15 ` briot.emmanuel 2 siblings, 1 reply; 22+ messages in thread From: Randy Brukardt @ 2018-11-21 23:34 UTC (permalink / raw) <briot.emmanuel@gmail.com> wrote in message news:62ffa1fb-6733-4f97-ba87-ae3103bfc877@googlegroups.com... >> I wouldn't claim that the situation is that dire; it seems to be related >> to >> the particular implementation of a particular GNAT feature (project >> scenario >> variables). If you're not implementing something where the source code >> location can be changed for a particular build, then timestamps will work >> (but you have to remember that they are quite granular). > > The trick of course is to define what a "build" is in your sentence. > If it is one execution of the builder (gprbuild, make,...) then I think it > is indeed > a reasonable assertion. > > If however a build is defined to something that amount to "in debug mode, > in production mode,..." > then of course it might happen that the sources are changed and the > timestamp have a timestamp > delta of less than 1s (when we generate code for instance). I'd argue that these are something else on top of individual builds. And that it is a mistake trying to combine basic building with those higher-level configuration management things. I've struggled with those higher level issues almost since the beginning of RR Software (we've almost always supported multiple targets for Janus/Ada). Neither conventional build tools nor configuration management tools are any help whatsoever for managing those situations. I've seen various attempts to do so, but none of them address the underlying issues very well. 
> Furthermore, the actual scenario was the following: in the automatic
> tests, I need to simulate the connection to the database, so that means I
> need to have support for alternate bodies (but I still compile in debug
> mode, or production mode,...). Is that still the same "build"?

No, at least four separate builds.

> I would guess it is, but in the end we would end up with literally dozens
> of "build" types, each with its own set of object files, and each taking 20
> or 30 minutes to build from scratch. Not realistic for continuous testing.

You need a faster compiler. :-) :-)

Seriously, at least debug vs. production has to be built over from scratch (at least the way I typically use those). The debug version uses different compiler options than the production version, as debug symbols need to be generated and some optimizations need to be turned off, while the production version turns off various Ada checking. So switching from one to the other requires a full rebuild anyway.

In recent years, I've avoided that problem by keeping multiple projects for debug and production and various targets, and thus (re)building each individually as needed. Disk space is plentiful on modern machines -- it's my time that's limited.

> I spent some time looking at general builder tools. Most of them seem to
> advertise nowadays that they look at file contents, not timestamps. I
> started from the list at
> https://en.wikipedia.org/wiki/List_of_build_automation_software, and
> looked at a few of them.

For source code, I tend to agree. The Janus/Ada COrder tool always had an option to read the source files instead of depending on timestamps. And Janus/Ada puts the timestamps into the compilation results, so that they can't be clobbered by file operations.

In any case, source code is only part of the picture.

>> It also seems to be related in part to source-based compilation (which
>> necessarily keeps less information between builds).
>> In a Janus/Ada project (which is very different than a GNAT project --
>> it's a binary DB-like file of compilation information), changing the
>> location of a source file would invalidate the entire entry and
>> essentially delete any existing compilations. More likely, however, is
>> that a scenario would be set up using separate project files (most likely
>> using Windows batch files/Unix shell-scripts to automate), so each would
>> have their own set of compilation states.
>
> That's more or less what gprbuild does in practice. It uses a "distributed
> database" via the .ALI files, which are found in the object directories, so
> for best use each "build" should have a different object directory. And we
> are again hitting the notion of "build".

Precisely. Higher-level things than raw builds are best kept separate at the compilation artifact level.

>> And it's completely impossible to bind multiple versions of a unit
>> into a single executable; only one or the other could be selected -
>
> That's indeed one of the ways gprbuild could detect the error. To me it is
> a bug in gprbuild that it allows linking different files for the same unit
> into the same executable.

I believe that is a result of the way GNAT compiles files -- the package specifications are never materialized, so it would be hard for it to have any compilation result which could tell which one is used. I've seen this sort of effect working on ACATS tests, and I've never had any reason to use GPRBuild for that.

>> somehow some files were compiled against the wrong one, some or all of the
>> compilation timestamps wouldn't match (which would cause binding failure).
>
> timestamps are not reliable enough, especially on modern fast machines. I am
> pretty sure you will hit a similar issue I had, one day.

I'm sorry, I confused you here. I was talking about the timestamps that Janus/Ada records for compilation units when they are compiled.
These are internal to the SYM files (which are a representation of the Ada symbol table for a library unit), and used to determine which version is "with"ed in other files. They're only compared for equality (other than for the purposes of error messages).

Even on a FAT system, these have 2 second granularity. You could only have a problem if the same specification is recompiled twice in under 2 seconds. It's hard to imagine a build taking less than two seconds; certainly not if a human is involved, and very unlikely even if automated. (The Janus/Ada binder is fairly slow as it removes unreachable subprograms recursively -- that's required for Windows programs because of the presence of bindings for a variety of Windows versions -- and as such it takes multiple seconds for all but the most trivial programs.) On more modern systems, we're talking hundredths of seconds granularity; it's essentially impossible for multiple builds to happen that fast.

COrder (the Janus/Ada compilation order tool that's at the heart of any builds) has an old /T option that uses file timestamps, but it has not been recommended for a while. It's faster than the /I option that inspects the internal timestamps and the source code ('cause it doesn't have to open hundreds of files and read part of them), but it messes up so often it is not recommended anymore. (One nice side-effect of /I is that one can simply delete all of the SYM files to force a rebuild of everything; /T doesn't always rebuild everything in that case.)

In any case, timestamps have their place, but they have to be used carefully.

Randy.
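[Editorial illustration] The idea of stamps that are stored inside the compilation results and only compared for equality can be sketched roughly as follows. This is a minimal Python sketch, not the Janus/Ada SYM format: the `CompiledUnit` record, its fields, and `stale_dependencies` are hypothetical names invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class CompiledUnit:
    """Hypothetical record of one compilation result (not the real SYM layout)."""
    name: str
    stamp: int                                  # stamp recorded when this unit was compiled
    withed: dict = field(default_factory=dict)  # unit name -> stamp seen at compile time


def stale_dependencies(unit, library):
    """Return the names of withed units whose recorded stamp no longer matches.

    Stamps are only tested for equality, never ordered.
    """
    return [name for name, seen in unit.withed.items()
            if name not in library or library[name].stamp != seen]


# Usage: B is compiled against A, then A is recompiled and gets a new stamp.
library = {"A": CompiledUnit("A", stamp=100)}
b = CompiledUnit("B", stamp=101, withed={"A": 100})
before = stale_dependencies(b, library)   # nothing is stale yet
library["A"] = CompiledUnit("A", stamp=102)
after = stale_dependencies(b, library)    # B now refers to an outdated A
```

Because the stamp lives in the compilation result rather than in file metadata, copying or touching files does not disturb it. The granularity caveat from the message still applies: two recompilations that receive the same stamp would be indistinguishable.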
* Re: GNAT Modification_Time limitation
From: briot.emmanuel @ 2018-11-22 8:15 UTC

> I'd argue that these are something else on top of individual builds. And
> that it is a mistake trying to combine basic building with those
> higher-level configuration management things.

Not sure about the concepts you describe. For me, "basic building" is one run of the compiler on one specific file, which always recompiles, no questions asked.
What we are talking about in this thread are tools at the level of make and gprbuild, which decide what should be compiled, and when. For me, this is the "higher-level build management" part. That includes configuration management, since this tool is also responsible for deciding where the build artifacts (object files, executables,...) should be stored.

> > its own set of object files, and each taking 20 or 30 minutes to build
> > from scratch. Not realistic for continuous testing.
>
> You need a faster compiler. :-) :-)

I wish I had one. Still, this is compiling 6000 Ada units + C files in 20 minutes, not too bad (using parallel builds, of course).

> Seriously, at least debug vs. production has to be built over from scratch
> (at least the way I typically use those).

Definitely. And I agree with your conclusion that we need separate builds for every possible combination of environment/switches (debug, production,...) and source files (alternate bodies,...). Disk space is cheaper than our time, though fast SSDs are still not quite as cheap as we would all like.

> It's hard to imagine a build taking less than two seconds; certainly not if
> a human is involved, and very unlikely even if automated.

Don't forget that Lionel's make-like tool and gprbuild are both meant to be language-neutral.
Compiling an Ada file in less than 2s is rare nowadays (but possible with very simple files). But compiling a Python file takes a few ms, so just looking at timestamps cannot be enough (though when compilation is that fast, it doesn't matter much to redo it more often...).

> (The Janus/Ada
> binder is fairly slow as it removes unreachable subprograms recursively --

Nice feature. With gcc we use link time optimization to achieve the same effect (and more), and that's slow indeed.

> COrder (the Janus/Ada compilation order tool that's at the heart of any
> builds) has an old /T option that uses file timestamps, but it has not been
> recommended for a while. It's faster than the /I option that inspects the
> internal timestamps and the source code ('cause it doesn't have to open
> hundreds of files and read part of them), but it messes up so often it is
> not recommended anymore. (One nice side-effect of /I is that one can simply
> delete all of the SYM files to force a rebuild of everything; /T doesn't
> always rebuild everything in that case.)

One interesting feature of the TUP builder I mentioned yesterday is that it comes with an optional daemon program that monitors changes on the file system (inotify on Linux), so that when you start the build it already knows which files have been modified and can start building right away. Saving 10s or more every time I launch gprbuild would be nice!

> In any case, timestamps have their place, but they have to be used
> carefully.

Seconded. They can be used as a shortcut: the builder can have a mode that says "assume the file was modified if the timestamp has changed, but if the timestamp is the same, check the contents". And then a "minimal recompilation switch" that says "only look at file contents to detect whether file has changed".

Emmanuel
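[Editorial illustration] The two modes described at the end of this message could look roughly like the following. This is a minimal Python sketch under stated assumptions, not how gprbuild or any other builder actually works: the function names, the `contents_only` flag, and the choice of SHA-256 are all invented for the example.

```python
import hashlib
import os


def file_digest(path):
    """SHA-256 of the file contents, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def snapshot(path):
    """Record what the builder saw after a successful build step."""
    return {"mtime_ns": os.stat(path).st_mtime_ns, "digest": file_digest(path)}


def is_modified(path, recorded, contents_only=False):
    """Decide whether `path` changed since `recorded` was taken.

    Default mode: a changed timestamp is trusted as "modified"; an
    unchanged timestamp is not trusted, so the contents are checked.
    contents_only=True (the "minimal recompilation" mode): timestamps
    are ignored and only the contents are compared.
    """
    if not contents_only and os.stat(path).st_mtime_ns != recorded["mtime_ns"]:
        return True
    return file_digest(path) != recorded["digest"]
```

In the default mode the cheap timestamp test only short-circuits the "changed" answer, so two edits that land within one timestamp tick are still caught by the digest. The contents-only mode avoids rebuilding a file that was merely touched, at the cost of always reading it.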
* Re: GNAT Modification_Time limitation
From: Randy Brukardt @ 2018-11-26 23:45 UTC

<briot.emmanuel@gmail.com> wrote in message
news:0d4679d2-30d7-493f-b9fd-688d044e1a4e@googlegroups.com...
>> I'd argue that these are something else on top of individual builds. And
>> that it is a mistake trying to combine basic building with those
>> higher-level configuration management things.
>
> Not sure about the concepts you describe. For me, "basic building"
> is one run of the compiler on one specific file, which always recompiles,
> no questions asked.
> What we are talking about in this thread are tools at the level of make and
> gprbuild, which decide what should be compiled, and when. For me, this is
> the "higher-level build management" part. That includes configuration
> management, since this tool is also responsible for deciding where the
> build artifacts (object files, executables,...) should be stored.

I think of a single compilation as a "compilation", while a "build" to me is something that results in one or more executable files, generally based on the source code found in a single directory. Going further (and we have some such features, particularly for sharing source/object between multiple distinct builds) is part of the higher-level management. (I don't have a simple name for that, which shows yet again how hard it is to describe.)

Ada of course allows "build" to be completely automated without any outside intervention at all. In theory, it's only necessary to point the build tool at the pile of source code.

...
>> It's hard to imagine a build taking less than two seconds; certainly not if
>> a human is involved, and very unlikely even if automated.
>
> Don't forget that Lionel's make-like tool and gprbuild are both meant to be
> language-neutral.
> Compiling an Ada file in less than 2s is rare nowadays (but possible with
> very simple files). But compiling a Python file takes a few ms, so just
> looking at timestamps cannot be enough (though when compilation is that
> fast, it doesn't matter much to redo it more often...

Again, a "build" in my view is compiling the set of files needed to create an executable. (Again, I'll ignore the management of shared libraries.) That generally requires the compilation of multiple files, and a linking phase as well. Moreover, unless you are running multiple builds from some higher-level tool, there's also human reaction time involved. The likelihood of that happening faster than 2 seconds isn't high. The issues I've seen almost always come from someone terminating a compilation in the middle without letting the compiler clean up any half-created artifacts.

Of course, most other languages need a lot of help to determine dependencies (information that is directly part of the Ada source code). That need for help has confused the issues a lot, because however you give it, it can't be automatic or bullet-proof. Thus, this gets mixed up with the higher-level issues. Ada only needs that help at a higher level than basic building; basic building should be automatic.

I've even had a customer (with large, complex systems) tell me that they didn't want the Ada compiler to even try to manage such things. They wanted to grab some set of source from version control and essentially have the compiler build it from that source (all found in one large glob in a single directory). They thought that build times were short enough that it wasn't worth the intermediate steps to avoid recompilations. I've rather thought that was the future of such tools; some higher-level management (probably from the configuration management system) where whatever the compiler does would seem to get in the way.

Randy.