%include paper.fmt

\chapter{Introduction}
\label{chp:introduction}

This thesis is concerned with functional programming. Throughout the thesis all examples and implementations are presented in Haskell \cite{haskell}. Much of this work takes advantage of the purity of Haskell, and some requires lazy evaluation, but many of the ideas should be applicable to other functional languages.

In this chapter, we first discuss the motivation underlying the problems we have tackled in \S\ref{secI:motivation}. Next we provide details of where to obtain implementations related to this thesis in \S\ref{secI:implementation}, followed by a description of each of the following chapters in \S\ref{secI:chapters}.


\section{Motivation and Objectives}
\label{secI:motivation}

This thesis has three main objectives: making functional programs shorter, faster and safer. This section explains the particular aims within each area, and how the areas are related. We present the motivation for the objectives in reverse order, being the order we tackled them, to show how each motivates the next.

\subsection{Making Programs Safer}

Haskell is a strongly typed language, ensuring that a large class of errors are caught at compile time. Despite all the guarantees that the type system provides, programs may still fail in three ways:

\begin{description}
\item[Wrong Behaviour] Detecting incorrect behaviour requires the programmer to provide annotations describing the desired behaviour. Mandatory annotations increase the effort required to make use of a tool, and therefore reduce the potential number of users.
\item[Non-termination] The issue of non-termination has been investigated extensively -- one particularly impressive tool is the AProVE framework \cite{aprove_haskell}.
\item[Calling |error|] The final cause of failure is calling |error|, often as the result of an incomplete pattern-match. This issue has not received as much attention, with suggestions that programmers only use exhaustive patterns \cite{turner:total}, or local analysis to decide which patterns are exhaustive \cite{maranget:pattern_warnings}. The problem of calling |error| is a practical one, with such failures being a common occurrence when developing a Haskell program.
\end{description}

In order to make programs safer, we have developed the Catch tool, which ensures a program does not call |error|. We decided to make our analysis \textit{conservative} -- if it reports that a program will not call |error|, then the program is guaranteed not to call |error|. We require no annotations from the programmer.

The Catch tool operates on a first-order language. We attempted to extend Catch to a higher-order language, but failed. A higher-order program has more complicated flow-control, which causes problems for Catch. In order to apply Catch to all Haskell programs, we have investigated defunctionalisation -- converting a higher-order program to a first-order program. Our defunctionalisation method is called Firstify, and uses well-known transformations, particularly specialisation and inlining, applied in particular ways. The defunctionalisation method is designed to be used as a transformation before analysis, primarily for Catch, but can be used independently.

\subsection{Making Programs Faster}

After making a program first-order, it can often execute faster than before. As we explored this aspect of defunctionalisation, we were drawn towards other optimisation techniques -- in particular supercompilation \cite{supercompilation}. Just as defunctionalisation often leads to improved performance, so supercompilation often leads to the removal of higher-order values. We attempted to construct a defunctionalisation method by restricting supercompilation, but the result was not very successful. However, we did enhance our defunctionalisation method using techniques from supercompilation, particularly the termination criteria.

We have developed a supercompiler named Supero. Our work on supercompilation aims to allow Haskell programs to be written in a high-level style, yet perform competitively. Often, to obtain high performance, Haskell programmers are forced to make use of low-level features such as unboxed types \cite{spj:unboxing}, provide additional annotations such as rewrite rules \cite{spj:rules} and express programs in an unnatural style, such as using |foldr| to obtain deforestation \cite{gill:shortcut_deforestation}. Supero can optimise Haskell programs, providing substantial speed-ups in some cases. Like Catch, Supero requires no annotations from the programmer.

\subsection{Making Programs Shorter}

Our final contribution is the Uniplate library. The expression type of the Core language we work with has over ten constructors. Most of these constructors contain embedded subexpressions. For most operations, we wish to have value-specific behaviour for a handful of constructors, and a default operation for the others. We started developing a small library of useful functions to deal with this complexity, and gradually abstracted the ideas. After refinement, the Uniplate library emerged. The library is particularly focused on concisely expressing common patterns. Compared to other work on generic programming patterns, such as SYB \cite{lammel:syb} and Compos \cite{bringert:compos}, the Uniplate library makes use of fewer language extensions and permits more concise operations.

The Uniplate library stands apart from the rest of the thesis in that it does not work on a core functional language, but is instead a general purpose library. However, the Uniplate techniques have been invaluable in implementing the other transformations.


\section{Implementations}
\label{secI:implementation}

We have implemented all the ideas presented in this thesis, and include sample code in the related chapters. Most of our implementations make use of a monadic framework to deal with issues such as obtaining unique free variables and tracking termination constraints. But to simplify the presentation, we ignore these issues -- they are mostly tedious engineering concerns, and do not effect the underlying algorithms.

All the code is available from the author's homepage\footnote{\url{http://www.cs.york.ac.uk/~ndm/}}. Additionally, we have released the following packages on the Hackage website\footnote{\url{http://hackage.haskell.org/}}:

\begin{description}
\item[Homeomorphic] This is a library for testing for homeomorphic embedding, used to ensure termination, as described in \S\ref{secB:homeomorphic}.
\item[Uniplate] This is the library described in Chapter \ref{chp:uniplate}.
\item[Derive] This tool can generate Uniplate instances, and is mentioned in \S\ref{secU:derive}.
\item[Yhc.Core] This is a library providing the data type for Yhc's Core language. It requires Uniplate to implement some of the functions.
\item[Supero] This is the program described in Chapter \ref{chp:supero}. It requires Yhc.Core as the Core language to operate on, Homeomorphic to ensure termination and Uniplate for various transformations.
\item[Firstify] This is the library described in Chapter \ref{chp:firstify}. Like Supero, this library requires Yhc.Core, Homeomorphic and Uniplate.
\item[Proposition] This is the proposition library described in \ref{chp:catch}, particularly Figure \ref{figC:prop}.
\item[Catch] This is the program described in Chapter \ref{chp:catch}. This library requires Proposition, and the Firstify library and all its dependencies.
\end{description}

\section{Chapter Outline}
\label{secI:chapters}

The Background chapter (\ref{chp:background}) describes a common Core language which is used in the subsequent chapters. It also describes the homeomorphic embedding relation, used to ensure termination in a number of transformations.

The Boilerplate Removal chapter (\ref{chp:uniplate}) describes the Uniplate library. In particular, it describes the interface to the library -- both the traversal functions and the information a data type must provide. It also compares the Uniplate library to the Scrap Your Boilerplate (SYB) library \cite{lammel:syb} and the Compos library \cite{bringert:compos} -- both in terms of speed and conciseness.

The Supercompilation chapter (\ref{chp:supero}) describes the design and implementation of the Supero tool. The method includes techniques for dealing with let bindings, and a new method for generalisation. Results are presented comparing a combination of Supero and the Glasgow Haskell Compiler (GHC) \cite{ghc} to C, and comparing Supero and GHC to GHC alone.

The Defunctionalisation chapter (\ref{chp:firstify}) describes how to combine several existing transformations to produce a defunctionalisation method. The main focus is how to restrict the existing methods to ensure they terminate and cooperate to obtain a program with few residual functional values.

The Pattern-Match Analysis chapter (\ref{chp:catch}) describes the implementation of the Catch tool. It presents a mechanism for reasoning about programs using a constraint language, along with two alternative constraint languages. The Catch tool is tested on a number of benchmark programs, and for several larger programs.

The Conclusions chapter (\ref{chp:conclusions}) gives directions for future work, and makes concluding remarks.

The Soundness of Pattern-Match Analysis Appendix (\ref{chp:proof}) provides a soundness proof of the algorithms presented in Chapter \ref{chp:catch}.

The Function Index Appendix (\ref{chp:index}) is an index of most of the Haskell functions used in the thesis, both those defined in the thesis and those from the standard Haskell libraries.

