I will be defending my doctoral dissertation “Evidence-based programming language design: a philosophical and methodological exploration” on December 4, 2015 at noon, in the Seminarium building, auditorium S212, of the University of Jyväskylä. My opponent will be Professor Lutz Prechelt (Freie Universität Berlin, Germany), and the custos is Professor Tommi Kärkkäinen (University of Jyväskylä).
The defense is public; anyone may come. Dress code for the audience is whatever one would wear to any lecture or regular academic activity at the university (no formal dress required). There is a Facebook event page.
The dissertation manuscript was reviewed (for a permission to publish and defend) by Professor Matthias Felleisen (Northeastern University, USA) and Professor Andreas Stefik (University of Nevada, Las Vegas, USA). The dissertation incorporates most of my licentiate thesis, which was examined last year by Doctor Stefan Hanenberg (University of Duisburg-Essen, Germany) and Professor Stein Krogdahl (University of Oslo, Norway).
The dissertation mentions Haskell in several places, although that is not its main focus.
Evidence-Based Programming Language Design. A Philosophical and Methodological Exploration.
Jyväskylä: University of Jyväskylä, 2015, 256 p.
(Jyväskylä Studies in Computing
ISSN 1456-5390; 222)
ISBN 978-951-39-6387-3 (nid.)
ISBN 978-951-39-6388-0 (PDF)
Background: Programming language design is not usually informed by empirical studies. In other fields similar problems have inspired an evidence-based paradigm of practice. Such a paradigm is practically inevitable in language design, as well. Aims: The content of evidence-based programming design (EB-PLD) is explored, as is the concept of evidence in general. Additionally, the extent of evidence potentially useful for EB-PLD is mapped, and the appropriateness of Cohen’s kappa for evaluating coder agreement in a secondary study is evaluated. Method: Philosophical analysis and explication are used to clarify the unclear. A systematic mapping study was conducted to map out the existing body of evidence. Results: Evidence is a report of observations that affects the strength of an argument. There is some but not much evidence. EB-PLD is a five-step process for resolving uncertainty about design problems. Cohen’s kappa is inappropriate for coder agreement evaluation in systematic secondary studies. Conclusions: Coder agreement evaluation should use Scott’s pi, Fleiss’ kappa, or Krippendorff’s alpha. EB-PLD is worthy of further research, although its usefulness was out of scope here.
Yesterday I received my official diploma for the degree of Licentiate of Philosophy. The degree lies between a Master’s degree and a doctorate, and is not required; it consists of the coursework required for a doctorate, and a Licentiate Thesis, “in which the student demonstrates good conversance with the field of research and the capability of independently and critically applying scientific research methods” (official translation of the Government decree on university degrees 794/2004, Section 23 Paragraph 2).
The title and abstract of my Licentiate Thesis follow:
The extent of empirical evidence that could inform evidence-based design of programming languages. A systematic mapping study.
Jyväskylä: University of Jyväskylä, 2014, 243 p.
(Jyväskylä Licentiate Theses in Computing,
ISSN 1795-9713; 18)
ISBN 978-951-39-5790-2 (nid.)
ISBN 978-951-39-5791-9 (PDF)
Background: Programming language design is not usually informed by empirical studies. In other fields similar problems have inspired an evidence-based paradigm of practice. Central to it are secondary studies summarizing and consolidating the research literature. Aims: This systematic mapping study looks for empirical research that could inform evidence-based design of programming languages. Method: Manual and keyword-based searches were performed, as was a single round of snowballing. There were 2056 potentially relevant publications, of which 180 were selected for inclusion, because they reported empirical evidence on the efficacy of potential design decisions and were published on or before 2012. A thematic synthesis was created. Results: Included studies span four decades, but activity has been sparse until the last five years or so. The form of conditional statements and loops, as well as the choice between static and dynamic typing have all been studied empirically for efficacy in at least five studies each. Error proneness, programming comprehension, and human effort are the most common forms of efficacy studied. Experimenting with programmer participants is the most popular method. Conclusions: There clearly are language design decisions for which empirical evidence regarding efficacy exists; they may be of some use to language designers, and several of them may be ripe for systematic reviewing. There is concern that the lack of interest generated by studies in this topic area until the recent surge of activity may indicate serious issues in their research approach.
Keywords: programming languages, programming language design, evidence-based paradigm, efficacy, research methods, systematic mapping study, thematic synthesis
A Licentiate Thesis is assessed by two examiners, usually drawn from outside of the home university; they write (either jointly or separately) a substantiated statement about the thesis, in which they suggest a grade. The final grade is almost always the one suggested by the examiners. I was very fortunate to have such prominent scientists as Dr. Stefan Hanenberg and Prof. Stein Krogdahl as the examiners of my thesis. They recommended, and I received, the grade “very good” (4 on a scale of 1–5).
The thesis has been accepted for publication in our faculty’s licentiate thesis series and will in due course appear in our university’s electronic database (along with a very small number of printed copies). In the mean time, if anyone wants an electronic preprint, send me email at firstname.lastname@example.org.
As you can imagine, the last couple of months in the spring were very stressful for me, as I pressed on to submit this thesis. After submission, it took me nearly two months to recover (which certain people who emailed me on Planet Haskell business during that period certainly noticed). It represents the fruit of almost four years of work (way more than normally is taken to complete a Licentiate Thesis, but never mind that), as I designed this study in Fall 2010.
Recently, I have been writing in my blog a series of posts in which I have been trying to clear my head about certain foundational issues that irritated me during the writing of the thesis. The thesis contains some of that, but that part of it is not very strong, as my examiners put it, for various reasons. The posts have been a deliberately non-academic attempt to shape the thoughts into words, to see what they look like fixed into a tangible form. (If you go read them, be warned: many of them are deliberately provocative, and many of them are intended as tentative in fact if not in phrasing; the series also is very incomplete at this time.)
In fact, the whole of 20th Century philosophy of science is a big pile of failed attempts to explain science; not one explanation is fully satisfactory. […] Most scientists enjoy not pondering it, for it’s a bit like being a cartoon character: so long as you don’t look down, you can walk on air.
I wrote my Master’s Thesis (PDF) in 2002. It was about the formal method called “B”; but I took a lot of time and pages to examine the history and content of formal logic. My supervisor was, understandably, exasperated, but I did receive the highest possible grade for it (which I never have fully accepted I deserved). The main reason for that digression: I looked down, and I just had to go poke the bridge I was standing on to make sure I was not, in fact, walking on air. In the many years since, I’ve taken a lot of time to study foundations, first of mathematics, and more recently of science. It is one reason it took me about eight years to come up with a doable doctoral project (and I am still amazed that my department kept employing me; but I suppose they like my teaching, as do I). The other reason was, it took me that long to realize how to study the design of programming languages without going where everyone has gone before.
Debian people, if any are still reading, may find it interesting that I found significant use for the dctrl-tools toolset I have been writing for Debian for about fifteen years: I stored my data collection as a big pile of dctrl-format files. I ended up making some changes to the existing tools (I should upload the new version soon, I suppose), and I wrote another toolset (unfortunately one that is not general purpose, like the dctrl-tools are) in the process.
For the Haskell people, I mainly have an apology for not attending to Planet Haskell duties in the summer; but I am back in business now. I also note, somewhat to my regret, that I found very few studies dealing with Haskell. I just checked; I mention Haskell several times in the background chapter, but it is not mentioned in the results chapter (because there were not studies worthy of special notice).
I am already working on extending this work into a doctoral thesis. I expect, and hope, to complete that one faster.