Filenames and separate compilation

This section describes what files GHC expects to find, what files it creates, where these files are stored, and what options affect this behaviour.

Note that this section is written with hierarchical modules in mind; hierarchical modules are an extension to Haskell 98 which extends the lexical syntax of module names to include a dot ‘.’. Non-hierarchical modules are thus a special case in which none of the module names contain dots.

Pathname conventions vary from system to system. In particular, the directory separator is ‘/’ on Unix systems and ‘\’ on Windows systems. In the sections that follow, we shall consistently use ‘/’ as the directory separator; substitute the appropriate character for your system.

Haskell source files

Each Haskell source module should be placed in a file on its own.

The file should usually be named after the module name, by replacing dots in the module name by directory separators. For example, on a Unix system, the module A.B.C should be placed in the file A/B/C.hs, relative to some base directory. GHC's behaviour if this rule is not followed is fully defined by the following section ("Output files").

Output files

When asked to compile a source file, GHC normally generates two files: an object file, and an interface file.

The object file, which normally ends in a .o suffix (or .obj if you're on Windows), contains the compiled code for the module.

The interface file, which normally ends in a .hi suffix, contains the information that GHC needs in order to compile further modules that depend on this module. It contains things like the types of exported functions, definitions of data types, and so on. It is stored in a binary format, so don't try to read one; use the --show-iface option instead (see "Other options related to interface files").

You should think of the object file and the interface file as a pair, since the interface file is in a sense a compiler-readable description of the contents of the object file. If the interface file and object file get out of sync for any reason, then the compiler may end up making assumptions about the object file that aren't true; trouble will almost certainly follow. For this reason, we recommend keeping object files and interface files in the same place (GHC does this by default, but it is possible to override the defaults as we'll explain shortly).

Every module has a module name defined in its source code (module A.B.C where ...).

Unless overridden with the -o and -ohi flags respectively, GHC always puts the object file for module A.B.C in odir/A/B/C.osuf, and the interface file in the file hidir/A/B/C.hisuf, where hidir, hisuf, odir, and osuf are defined as follows:

    hidir is the value of the -hidir option if one was given, or root-path otherwise.

    hisuf is the value of the -hisuf option if one was given, or hi otherwise.

    odir is the value of the -odir option if one was given, or root-path otherwise.

    osuf is the value of the -osuf option if one was given, or o otherwise (obj on Windows).

The root-path, used in the above definitions, is derived from the location of the source file, source-filename, as follows:

    Rule 1: GHC matches source-filename against the pattern root-path/A/B/C.extension, where extension is the source file extension (usually .hs or .lhs) and root-path is what is left after A/B/C.extension has been stripped off the end of source-filename.

    Rule 2: If source-filename does not match the pattern above (presumably because it doesn't finish with A/B/C.hs or A/B/C.lhs) then root-path becomes the whole of the directory portion of the filename.

For example, if GHC compiles the module A.B.C in the file src/A/B/C.hs, with no -odir or -hidir flags, the interface file will be put in src/A/B/C.hi and the object file in src/A/B/C.o (using Rule 1).
If the same module A.B.C was in file src/ABC.hs, the interface file will still be put in src/A/B/C.hi and the object file in src/A/B/C.o (using Rule 2).

A common use for Rule 2 is to have many modules all called Main held in files Test1.hs, Test2.hs, etc. Beware, though: when compiling (say) Test2.hs, GHC will consult Main.hi for version information from the last recompilation. Currently (a bug, really) GHC is not clever enough to spot that the source file has changed, and so there is a danger that the recompilation checker will declare that no recompilation is needed when in fact it is. Solution: delete the interface file first.

Notice that (unless overridden with -o or -ohi) the filenames of the object and interface files are always based on the module name. The reason for this is so that GHC can find the interface file for module A.B.C when compiling the declaration "import A.B.C".

The search path

In your program, you import a module Foo by saying import Foo. In --make mode or GHCi, GHC will look for a source file for Foo and arrange to compile it first. Without --make, GHC will look for the interface file for Foo, which should have been created by an earlier compilation of Foo. GHC uses the same strategy in each of these cases for finding the appropriate file.

This strategy is as follows: GHC keeps a list of directories called the search path. For each of these directories, it tries appending basename.extension to the directory, and checks whether the file exists. The value of basename is the module name with dots replaced by the directory separator ('/' or '\', depending on the system), and extension is a source extension (hs, lhs) if we are in --make mode or GHCi, or hisuf otherwise.

For example, suppose the search path contains directories d1, d2, and d3, and we are in --make mode looking for the source file for a module A.B.C. GHC will look in d1/A/B/C.hs, d1/A/B/C.lhs, d2/A/B/C.hs, and so on.
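
The file-location rules above (Rule 1, Rule 2 and the search path lookup) can be sketched in a few lines of Haskell. This is only an illustration under simplifying assumptions, not GHC's actual implementation: the function names (moduleBase, rootPath, outputFiles, searchCandidates) are made up for this sketch, ‘/’ is the only separator handled, and no -odir, -hidir, -osuf or -hisuf overrides are modelled.

```haskell
import Data.List (stripPrefix)

-- | Turn a module name such as "A.B.C" into the relative base path "A/B/C".
moduleBase :: String -> FilePath
moduleBase = map (\c -> if c == '.' then '/' else c)

-- | Remove a suffix from a string, if it is present (hypothetical helper).
stripSuffix :: String -> String -> Maybe String
stripSuffix suf s = reverse <$> stripPrefix (reverse suf) (reverse s)

-- | Directory portion of a path, or "." when the path has no '/'.
dirPart :: FilePath -> FilePath
dirPart p = case break (== '/') (reverse p) of
  (_, [])         -> "."
  (_, _ : revDir) -> reverse revDir

-- | Rule 1: strip "A/B/C.extension" off the end of the source filename.
--   Rule 2: otherwise take the whole directory portion of the filename.
rootPath :: String -> FilePath -> FilePath
rootPath modName src =
  case [ root | ext <- ["hs", "lhs"]
              , Just pre <- [stripSuffix (moduleBase modName ++ "." ++ ext) src]
              , Just root <- [asRoot pre] ] of
    (root : _) -> root          -- Rule 1 matched
    []         -> dirPart src   -- Rule 2
  where
    -- What is left in front of "A/B/C.extension" must be empty or end in '/'.
    asRoot ""  = Just "."
    asRoot pre | last pre == '/' = Just (init pre)
               | otherwise       = Nothing

-- | Default (interface file, object file) locations for a module.
outputFiles :: String -> FilePath -> (FilePath, FilePath)
outputFiles modName src = (base ++ ".hi", base ++ ".o")
  where base = rootPath modName src ++ "/" ++ moduleBase modName

-- | Files tried, in order, when looking a module up along a search path.
searchCandidates :: [FilePath] -> [String] -> String -> [FilePath]
searchCandidates dirs exts modName =
  [ d ++ "/" ++ moduleBase modName ++ "." ++ e | d <- dirs, e <- exts ]
```

With these definitions, outputFiles "A.B.C" "src/A/B/C.hs" and outputFiles "A.B.C" "src/ABC.hs" both give ("src/A/B/C.hi", "src/A/B/C.o"), matching the Rule 1 and Rule 2 examples, and searchCandidates ["d1","d2","d3"] ["hs","lhs"] "A.B.C" starts with d1/A/B/C.hs, d1/A/B/C.lhs, d2/A/B/C.hs.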

The search path by default contains a single directory: . (i.e. the current directory). The following options can be used to add to or change the contents of the search path:

-idirs

    This flag appends a colon-separated list of dirs to the search path.

-i

    resets the search path back to nothing.

This isn't the whole story: GHC also looks for modules in pre-compiled libraries, known as packages. See the section on packages for details.

Redirecting the compilation output(s)

-o file

    GHC's compiled output normally goes into a .hc, .o, etc., file, depending on the last-run compilation phase. The -o option redirects the output of that last-run phase to file.

    Note: this “feature” can be counterintuitive: ghc -C -o foo.o foo.hs will put the intermediate C code in the file foo.o, name notwithstanding!

    This option is most often used when creating an executable file, to set the filename of the executable. For example:

        ghc -o prog --make Main

    will compile the program starting with module Main and put the executable in the file prog.

    Note: on Windows, if the result is an executable file, the extension ".exe" is added if the specified filename does not already have an extension. Thus

        ghc -o foo Main.hs

    will compile and link the module Main.hs, and put the resulting executable in foo.exe (not foo).

-odir dir

    Redirects object files to directory dir. For example:

        $ ghc -c parse/Foo.hs parse/Bar.hs gurgle/Bumble.hs -odir `arch`

    The object files, Foo.o, Bar.o, and Bumble.o would be put into a subdirectory named after the architecture of the executing machine (x86, mips, etc).

    Note that the -odir option does not affect where the interface files are put; use the -hidir option for that. In the above example, they would still be put in parse/Foo.hi, parse/Bar.hi, and gurgle/Bumble.hi.

-ohi file

    The interface output may be directed to another file bar2/Wurble.iface with the -ohi bar2/Wurble.iface option (not recommended).

    WARNING: if you redirect the interface file somewhere that GHC can't find it, then the recompilation checker may get confused (at the least, you won't get any recompilation avoidance). We recommend using a combination of -hidir and -hisuf options instead, if possible.

    To avoid generating an interface at all, you could use this option to redirect the interface into the bit bucket: -ohi /dev/null, for example.

-hidir dir

    Redirects all generated interface files into dir, instead of the default.

-osuf suffix, -hisuf suffix, -hcsuf suffix

    EXOTICA: The -osuf suffix option will change the .o file suffix for object files to whatever you specify. We use this when compiling libraries, so that objects for the profiling versions of the libraries don't clobber the normal ones.

    Similarly, the -hisuf suffix option will change the .hi file suffix for non-system interface files.

    Finally, the option -hcsuf suffix will change the .hc file suffix for compiler-generated intermediate C files.

    The -hisuf/-osuf game is particularly useful if you want to compile a program both with and without profiling, in the same directory. You can say:

        ghc ...

    to get the ordinary version, and

        ghc ... -osuf prof.o -hisuf prof.hi -prof -auto-all

    to get the profiled version.

Keeping Intermediate Files

The following options are useful for keeping certain intermediate files around, when normally GHC would throw these away after compilation:

-keep-hc-files

    Keep intermediate .hc files when doing .hs-to-.o compilations via C. (NOTE: .hc files aren't generated when using the native code generator; you may need to use -fvia-C to force them to be produced.)

-keep-s-files

    Keep intermediate .s files.

-keep-raw-s-files

    Keep intermediate .raw-s files. These are the direct output from the C compiler, before GHC does “assembly mangling” to produce the .s file. Again, these are not produced when using the native code generator.

-keep-tmp-files

    Instructs the GHC driver not to delete any of its temporary files, which it normally keeps in /tmp (or possibly elsewhere; see "Redirecting temporary files").
    Running GHC with -v will show you what temporary files were generated along the way.

Redirecting temporary files

-tmpdir <dir>

    If you have trouble because of running out of space in /tmp (or wherever your installation thinks temporary files should go), you may use the -tmpdir <dir> option to specify an alternate directory. For example, -tmpdir . says to put temporary files in the current working directory.

    Alternatively, use your TMPDIR environment variable. Set it to the name of the directory where temporary files should be put. GCC and other programs will honour the TMPDIR variable as well.

    Even better idea: set the DEFAULT_TMPDIR make variable when building GHC, and never worry about TMPDIR again (see the build documentation).

Other options related to interface files

-ddump-hi

    Dumps the new interface to standard output.

-ddump-hi-diffs

    The compiler does not overwrite an existing .hi interface file if the new one is the same as the old one; this is friendly to make. When an interface does change, it is often enlightening to be informed. The -ddump-hi-diffs option will make GHC run diff on the old and new .hi files.

-ddump-minimal-imports

    Dump to the file "M.imports" (where M is the module being compiled) a "minimal" set of import declarations. You can safely replace all the import declarations in "M.hs" with those found in "M.imports". Why would you want to do that? Because the "minimal" imports (a) import everything explicitly, by name, and (b) import nothing that is not required. It can be quite painful to maintain this property by hand, so this flag is intended to reduce the labour.

--show-iface file

    Where file is the name of an interface file, dumps the contents of that interface in a human-readable (ish) format.

The recompilation checker

-no-recomp

    Turn off recompilation checking (which is on by default).
    Recompilation checking normally stops compilation early, leaving an existing .o file in place, if it can be determined that the module does not need to be recompiled.

In the olden days, GHC compared the newly-generated .hi file with the previous version; if they were identical, it left the old one alone and didn't change its modification date. In consequence, importers of a module with an unchanged output .hi file were not recompiled.

This doesn't work any more. Suppose module C imports module B, and B imports module A. So changes to A.hi should force a recompilation of C. And some changes to A (changing the definition of a function that appears in an inlining of a function exported by B, say) may conceivably not change B.hi one jot. So now…

GHC keeps a version number on each interface file, and on each type signature within the interface file. It also keeps in every interface file a list of the version numbers of everything it used when it last compiled the file. If the source file's modification date is earlier than the .o file's date (i.e. the source hasn't changed since the file was last compiled), and recompilation checking is on, GHC will be clever. It compares the version numbers on the things it needs this time with the version numbers on the things it needed last time (gleaned from the interface file of the module being compiled); if they are all the same it stops compiling rather early in the process saying “Compilation IS NOT required”. What a beautiful sight!

Patrick Sansom had a workshop paper about how all this is done (though the details have changed quite a bit). Ask him if you want a copy.

Using make

It is reasonably straightforward to set up a Makefile to use with GHC, assuming you name your source files the same as your modules.
Thus:

    HC      = ghc
    HC_OPTS = -cpp $(EXTRA_HC_OPTS)

    SRCS = Main.lhs Foo.lhs Bar.lhs
    OBJS = Main.o   Foo.o   Bar.o

    .SUFFIXES : .o .hs .hi .lhs .hc .s

    cool_pgm : $(OBJS)
            rm -f $@
            $(HC) -o $@ $(HC_OPTS) $(OBJS)

    # Standard suffix rules
    .o.hi:
            @:

    .lhs.o:
            $(HC) -c $< $(HC_OPTS)

    .hs.o:
            $(HC) -c $< $(HC_OPTS)

    # Inter-module dependencies
    Foo.o Foo.hc Foo.s    : Baz.hi          # Foo imports Baz
    Main.o Main.hc Main.s : Foo.hi Baz.hi   # Main imports Foo and Baz

(Sophisticated make variants may achieve some of the above more elegantly. Notably, gmake's pattern rules let you write the more comprehensible:

    %.o : %.lhs
            $(HC) -c $< $(HC_OPTS)

What we've shown should work with any make.)

Note the cheesy .o.hi rule: It records the dependency of the interface (.hi) file on the source. The rule says a .hi file can be made from a .o file by doing…nothing. Which is true.

Note the inter-module dependencies at the end of the Makefile, which take the form

    Foo.o Foo.hc Foo.s    : Baz.hi          # Foo imports Baz

They tell make that if any of Foo.o, Foo.hc or Foo.s have an earlier modification date than Baz.hi, then the out-of-date file must be brought up to date. To bring it up to date, make looks for a rule to do so; one of the preceding suffix rules does the job nicely.

Dependency generation

Putting inter-dependencies of the form Foo.o : Bar.hi into your Makefile by hand is rather error-prone. Don't worry, GHC has support for automatically generating the required dependencies. Add the following to your Makefile:

    depend :
            ghc -M $(HC_OPTS) $(SRCS)

Now, before you start compiling, and any time you change the imports in your program, do make depend before you do make cool_pgm. ghc -M will append the needed dependencies to your Makefile.

In general, if module A contains the line

    import B ...blah...

then ghc -M will generate a dependency line of the form:

    A.o : B.hi

If module A contains the line

    import {-# SOURCE #-} B ...blah...

then ghc -M will generate a dependency line of the form:

    A.o : B.hi-boot

(See "How to compile mutually recursive modules" for details of hi-boot style interface files.) If A imports multiple modules, then there will be multiple lines with A.o as the target.

By default, ghc -M generates all the dependencies, and then concatenates them onto the end of makefile (or Makefile if makefile doesn't exist) bracketed by the lines "# DO NOT DELETE: Beginning of Haskell dependencies" and "# DO NOT DELETE: End of Haskell dependencies". If these lines already exist in the makefile, then the old dependencies are deleted first.

Don't forget to use the same options on the ghc -M command line as you would when compiling; this enables the dependency generator to locate any imported modules that come from packages. The package modules won't be included in the dependencies generated, though (but see the option below).

The dependency generation phase of GHC can take some additional options, which you may find useful. For historical reasons, each option passed to the dependency generator from the GHC command line must be preceded by -optdep. For example, to pass -f .depend to the dependency generator, you say

    ghc -M -optdep-f -optdep.depend ...

The options which affect dependency generation are:

    Turn off warnings about interface file shadowing.

    -f file: Use file as the makefile, rather than makefile or Makefile. If file doesn't exist, mkdependHS creates it. We often use -f .depend to put the dependencies in .depend and then include the file .depend into Makefile.

    Use .<osuf> as the "target file" suffix (default: o). Multiple flags are permitted (GHC2.05 onwards). Thus "" will generate dependencies for .hc and .o files.

    Make extra dependencies that declare that files with suffix .<suf>_<osuf> depend on interface files with suffix .<suf>_hi, or (for {-# SOURCE #-} imports) on .hi-boot. Multiple flags are permitted. For example, will make dependencies for .hc on .hi, .a_hc on .a_hi, and .b_hc on .b_hi. (Useful in conjunction with NoFib "ways".)

    Regard <file> as "stable"; i.e., exclude it from having dependencies on it.

    same as

    Regard the colon-separated list of directories <dirs> as containing stable modules; i.e., don't generate any dependencies on modules therein.

    Regard <file> as not "stable"; i.e., generate dependencies on it (if any). This option is normally used in conjunction with the option.

    Regard modules imported from packages as unstable, i.e., generate dependencies on the package modules used (including Prelude, and all other standard Haskell libraries). This option is normally only used by the various system libraries.

How to compile mutually recursive modules

Currently, the compiler does not have proper support for dealing with mutually recursive modules:

    module A where

    import B

    newtype TA = MkTA Int

    f :: TB -> TA
    f (MkTB x) = MkTA x
    --------
    module B where

    import A

    data TB = MkTB !Int

    g :: TA -> TB
    g (MkTA x) = MkTB x

When compiling either module A or B, the compiler will try (in vain) to look for the interface file of the other. So, to get mutually recursive modules off the ground, you need to hand write an interface file for A or B, so as to break the loop. These hand-written interface files are called hi-boot files, and are placed in a file called <module>.hi-boot. To import from an hi-boot file instead of the standard .hi file, use the following syntax in the importing module:

    import {-# SOURCE #-} A

The hand-written interface need only contain the bare minimum of information needed to get the bootstrapping process started. For example, it doesn't need to contain declarations for everything that module A exports, only the things required by the module that imports A recursively.

For the example at hand, the boot interface file for A would look like the following:

    module A where
    newtype TA = MkTA GHC.Base.Int

The syntax is similar to a normal Haskell source file, but with some important differences:

    Non-local entities must be qualified with their original defining module. Qualifying by a module which just re-exports the entity won't do. In particular, most Prelude entities aren't actually defined in the Prelude (see for example GHC.Base.Int in the above example). HINT: to find out the fully-qualified name for entities in the Prelude (or anywhere for that matter), try using GHCi's :info command, e.g.

        Prelude> :m -Prelude
        > :i IO.IO
        -- GHC.IOBase.IO is a type constructor
        newtype GHC.IOBase.IO a
        ...

    Only data, type, newtype, class, and type signature declarations may be included. You cannot declare instances or derive them automatically.

Notice that we only put the declaration for the newtype TA in the hi-boot file, not the signature for f, since f isn't used by B.

If you want an hi-boot file to export a data type, but you don't want to give its constructors (because the constructors aren't used by the SOURCE-importing module), you can write simply:

    module A where
    data TA

(You must write all the type parameters, but leave out the '=' and everything that follows it.)

Orphan modules and instance declarations

Haskell specifies that when compiling module M, any instance declaration in any module "below" M is visible. (Module A is "below" M if A is imported directly by M, or if A is below a module that M imports directly.) In principle, GHC must therefore read the interface files of every module below M, just in case they contain an instance declaration that matters to M. This would be a disaster in practice, so GHC tries to be clever.

In particular, if an instance declaration is in the same module as the definition of any type or class mentioned in the head of the instance declaration, then GHC has to visit that interface file anyway.
Example:

    module A where
      instance C a => D (T a) where ...
      data T a = ...

The instance declaration is only relevant if the type T is in use, and if so, GHC will have visited A's interface file to find T's definition.

The only problem comes when a module contains an instance declaration and GHC has no other reason for visiting the module. Example:

    module Orphan where
      instance C a => D (T a) where ...
      class C a where ...

Here, neither D nor T is declared in module Orphan. We call such modules "orphan modules", defined thus:

    An orphan module contains at least one orphan instance or at least one orphan rule.

    An instance declaration in a module M is an orphan instance if none of the type constructors or classes mentioned in the instance head (the part after the "=>") are declared in M. Only the instance head counts. In the example above, it is not good enough for C's declaration to be in module Orphan; it must be the declaration of D or T.

    A rewrite rule in a module M is an orphan rule if none of the variables, type constructors, or classes that are free in the left hand side of the rule are declared in M.

GHC identifies orphan modules, and visits the interface file of every orphan module below the module being compiled. This is usually wasted work, but there is no avoiding it. You should therefore do your best to have as few orphan modules as possible.

You can identify an orphan module by looking in its interface file, M.hi, using the --show-iface option. If there is a "!" on the first line, GHC considers it an orphan module.
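
The orphan-instance definition above can be made concrete with a small Haskell sketch. It is a toy model under stated assumptions, not how GHC represents modules: the Instance and Module records and the functions isOrphanInstance and isOrphanModule are hypothetical names invented for illustration, names are plain strings, and orphan rewrite rules are left out entirely.

```haskell
-- | A simplified instance declaration: only the names in the instance head
--   (the part after "=>") are recorded, since only the head counts.
data Instance = Instance
  { headClass  :: String    -- class in the head, e.g. "D"
  , headTyCons :: [String]  -- type constructors in the head, e.g. ["T"]
  }

-- | A simplified module: the classes and type constructors it declares,
--   plus its instance declarations.
data Module = Module
  { declared  :: [String]
  , instances :: [Instance]
  }

-- | An instance is an orphan if none of the names in its head are
--   declared in the module that contains it.
isOrphanInstance :: [String] -> Instance -> Bool
isOrphanInstance decls inst =
  all (`notElem` decls) (headClass inst : headTyCons inst)

-- | A module is an orphan module if it has at least one orphan instance
--   (orphan rules are not modelled in this sketch).
isOrphanModule :: Module -> Bool
isOrphanModule m = any (isOrphanInstance (declared m)) (instances m)

-- | The first example above: T is declared next to "instance C a => D (T a)".
moduleA :: Module
moduleA = Module { declared = ["T"], instances = [Instance "D" ["T"]] }

-- | The second example above: only C is declared, and C is not in the head.
moduleOrphan :: Module
moduleOrphan = Module { declared = ["C"], instances = [Instance "D" ["T"]] }
```

Under this model isOrphanModule moduleA is False and isOrphanModule moduleOrphan is True, which mirrors the two examples: declaring the context class C does not help, because C does not appear in the instance head.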