Mayday mayday Estonia please…

MV Estonia

I think I was eating in the school cafeteria late in the morning when I heard others talking about some terrible event. I had not been following the news, so I asked what it was about. I was told that a ferry on the Sweden route had sunk, with many dead.

I do not really remember anything more about the day itself. I sometimes travelled to Stockholm by ferry myself to visit relatives, and I have surely also sailed on the Estonia (back when she was still the Viking Sally or the Silja Star). What I do remember vividly is the first time I was on a ship after the sinking of the Estonia. My father and I were on a trip, just the two of us, on Silja from Turku to Stockholm as far as I recall, and naturally the shipping lines were doing a lot to reassure passengers. Among other things, the ship held a briefing on its safety systems for interested passengers. That is when I learned how a Sweden ferry is evacuated. I also learned that an accident like the Estonia's could hardly happen on the Turku–Stockholm route, because the shore is close by for nearly the whole way (though, in hindsight, so it was for the Costa Concordia).

As an adult I became interested in writing fiction. One topic I explored in that connection was disasters of various kinds and how organizations face them. I read about Hiroshima. I studied the rules of shipping and aviation, and especially the phraseology of radio communication. I watched the series Lentoturmatutkinta (Air Crash Investigation) and Hetki ennen tuhoa (Seconds From Disaster). I listened to the Estonia's distress call.

“Mayday mayday Estonia please” is not a distress message that conforms to the phraseology, but it burned itself into my mind, and I will probably remember it for the rest of my life. The distress traffic that night was not by the book in other respects either, but the most important thing happened: the distress message was heard and the Estonia's position was established. That is why as many people survived as did. On the other hand, many other things went wrong, and I am not talking only about the failure of the bow visor.

Nowadays, when I board a ship, I familiarize myself with the evacuation route early on, not only from the map but also by finding the passageways and the muster station. I need a cabin with a window so that I can look out at night and calm myself after waking from a nightmare. On an airplane I listen carefully to the safety briefing. My own work at the university is not particularly risky, but I have nevertheless mentally rehearsed what to do if, for example, a fire alarm goes off while I am teaching; after all, I am then responsible for evacuating the room. The suspected gunman incident on the Mattilanniemi campus three years ago has also led me to think about what I would do in a comparable real situation. I read most of the safety investigation reports published in Finland and ponder what could be learned from them. Hopefully these precautions will prove unnecessary in my case, but as one passenger who survived the Estonia disaster said:

I had prepared for an accident, and the accident came. Still, that means absolutely nothing. It is a matter of chance and statistics. Many people check the escape routes. Most of them never have to use them. Rather than providence, I would prefer to speak of playing the lottery. If you want to win the lottery, the first thing you have to do is hand in a coupon. I had played, and with the Estonia my numbers came up.

How has the Estonia disaster affected you?

The extent of empirical evidence that could inform evidence-based design of programming languages: A systematic mapping study

(This is the Finnish summary of my English-language Licentiate Thesis, given here in English translation.)

There are thousands of programming languages, and more are being created (and existing ones modified) all the time. This creation and development work is usually based on the designers' and developers' own sense of style, personal preferences, and theoretical knowledge. Empirical research evidence on the efficacy of programming languages and of changes to them is hardly used at all. Yet the psychology of programming is a field of research more than forty years old, and one would think it could be of use to the designers and developers of programming languages.

For several decades now, future physicians have been taught the model of evidence-based medicine. If a physician is unsure how to deal with a particular patient's problem, first, they formulate an answerable question; second, they search the research literature and the secondary sources based on it for evidence that answers that question; third, they appraise the reliability of that evidence; fourth, they apply the answer given by the evidence to the patient's problem; and fifth, they evaluate their own performance in this process. This model of practice, which originated in medicine, has since been adopted, where applicable, in many other expertise-based fields, software engineering among them.

The starting point of my Licentiate Thesis was the idea of evidence-based programming language design. The aim of the work was to find out how much empirical research evidence exists that could be useful to designers of programming languages. I focused on studies that attempt to compare the efficacy of two or more alternative design decisions from the programmer's point of view. In addition, I wanted to find out which design decisions have been studied in this way, in what different ways efficacy has been understood in such studies, and which research methods such studies have used.

My Licentiate Thesis is a literature-based study, a so-called secondary study, whose material consists of primary studies, that is, studies in which the researchers themselves have directly observed the phenomenon under study. Most systematic secondary studies fall into two main classes. Systematic reviews seek to answer very narrowly defined questions that are relevant to practice. Systematic mapping studies, in turn, seek to outline the general state of the research literature in some research area. My work is clearly a mapping study.

By way of background, I have discussed in the thesis various classifications of programming languages (language levels, generations, and paradigms), the conceptual structure of languages, the history of certain design decisions, and the history of programming language development. In addition, I have summarized at some length the methodological guidelines for systematic mapping studies published in the field of software engineering. The thesis also contains an analysis of the concept of a programming language and an epistemological discussion of the concept of evidence.

I searched for the source material of the mapping study itself using several search methods. First, I browsed through all issues of certain international journals and conference proceedings (making use of tables of contents and abstracts published online). Next, I ran keyword searches in several internationally known research literature databases. Finally, I looked for additional sources in the reference lists of all the publications that had been found by the previous searches and accepted into my mapping study, and in the lists that certain databases keep of publications citing them; below I call this the snowball search.

I went through the publications found by the searches in three rounds. In the first round I rejected the publications that were obviously irrelevant to my study. In the second round I rejected those of whose irrelevance I was convinced. In these two rounds my decisions were based on metadata available online. For the third round I obtained the full text of every remaining publication, either on paper or electronically. In this round I rejected those of whose irrelevance I became convinced; the rest I included in this study. To check the correctness of the selections, each of my thesis supervisors made an independent accept-or-reject decision on a small random sample of the publications found. We mostly agreed; we settled the disagreements by consensus.

I included in the mapping study those primary and secondary studies that attempted to determine the efficacy of some programming language design decision from the programmer's point of view, for which a complete research report published in 2012 at the latest was available in English, Finnish, or Swedish, and which presented empirical evidence in support of their claims.

Browsing yielded 1515 publications that passed the first round, the keyword searches yielded 248 more, and the snowball search a further 293 on top of these. After the second round there remained 1045 publications found by browsing, 151 found by the keyword searches, and 223 found by the snowball search. In the end, 180 publications were accepted into the mapping study, reporting 137 primary studies. The publications also reported 19 secondary studies. In the mapping proper I have dealt only with primary studies.
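As a quick arithmetic check on the counts above, the following Python sketch adds up the per-method figures for each screening round. The variable names are illustrative and do not come from the thesis; only the numbers are taken from the text. The first-round total should agree with the 2056 potentially relevant publications mentioned in the English abstract.

```python
# Publication counts per search method, as reported in the summary.
first_round = {"browsing": 1515, "keyword": 248, "snowball": 293}
second_round = {"browsing": 1045, "keyword": 151, "snowball": 223}
included_publications = 180  # accepted after the full-text (third) round

total_first = sum(first_round.values())
total_second = sum(second_round.values())

# Share of first-round publications that survived the second round, per method.
for method, before in first_round.items():
    after = second_round[method]
    print(f"{method:9s} {before:5d} -> {after:5d} ({after / before:.0%} kept)")

print(f"total     {total_first:5d} -> {total_second:5d} -> {included_publications}")
print(f"overall inclusion rate: {included_publications / total_first:.1%}")
```

The per-method breakdown is only a convenience; the summary itself does not report how the 180 included publications divide among the three search methods.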

I made a thematic synthesis of the included publications as follows. First, I read through all the included publications. Next, I selected from each one direct quotations that related to the topic of my study. After that, I coded the quotations (that is, I assigned them descriptive keywords). Based on the codes, I looked for themes that emerged from the material and were significant for the topic of my study, and from these I finally built a thematic model. To assess the correctness of the coding, one of my supervisors recoded a few articles; our codings differed somewhat.

My thematic model divided the primary studies included in the mapping into two classes. The periphery consisted of studies that were not very relevant to my mapping: they merely compared languages or classes of languages with one another, or used individual existing programs or programming tasks to demonstrate the feasibility of some technology. The remaining 65 studies formed the core, which in turn divided, onion-like, into several layers according to the research method used.

The outermost layer of the core onion consisted of studies that used no experimental setup of any kind; typically these were quantitative observational studies or qualitative studies. The next layer inward consisted of experiments, that is, studies in which the researchers have sought to influence the study situation in such a way that the resulting change in the outcome measures can be observed. The following layer, the second innermost, consisted of controlled experiments, that is, studies in which the participants or other subjects have been divided into groups according to which of the design decisions under study they use, or in which order they use them. The innermost layer of the core onion, its heart, consisted of randomized controlled experiments, that is, controlled experiments in which the participants or other subjects have been divided into groups by some random process. There were 22 studies in the heart of the onion.

An interesting phenomenon could be seen in the publication dates of the studies. The oldest publication included in the mapping was published in 1973 and the newest in 2012 (since I did not include any newer ones). Until around the turn of the millennium, roughly the same number of studies was published each year, but the numbers rose around the turn of the millennium and again, dramatically, around 2008. A similar phenomenon can be seen, though more weakly, in all layers of the core onion.

In the mapping I found that the efficacy of programming language design decisions has been studied to some extent: in all, 141 studies were found, 22 of them randomized controlled experiments. The most studied topics are the different ways of expressing branching of execution (11 experiments in the core, of which 8 controlled, of which 3 randomized; oldest study published in 1973), the choice between static and dynamic typing (6 studies in the core, of which 5 controlled experiments, of which 4 randomized; oldest study published in 2009), and the different ways of expressing a loop construct (5 studies in the core, of which 4 experiments, of which 3 controlled and one randomized; oldest study published in 1978). The studies have examined efficacy mainly in terms of error proneness, program comprehensibility, and the effort of programming work.

The most popular research method in the core was the (quantitative) experiment, used by 41 studies. The second most popular, with 11 studies, was a design in which existing programs were modified to make use of a new programming language design decision. The third most popular, with 8 studies, was the analysis of a software corpus. The core also included case studies (2), surveys (2), and the analysis of program pairs (1). In the experimental studies in the core, the participants were most commonly programmers (35 experiments), who most often were programming students (29 experiments).

The results of the mapping paint a rather depressing picture of the research activity in the area it covers. It seems that every now and then some researcher or research group gets the idea that studies like these would be a fine thing, does a few of them, and then gets bored and changes topic. The published studies do not seem to have inspired much follow-up research, and no paradigm-founding exemplar studies seem to have emerged. It is possible, however, that the increase in research activity over the last five years means that the situation has changed; but since the numbers are still small, the situation may return to a low level of activity after a few years. Unfortunately, nothing can be concluded from my data about the research activity of the most recent years.

In the course of my thesis work I observed that the abstracts of research articles in the programming languages field are rather useless, since they often report neither the method of the empirical part of the study nor the results obtained with it. The idea of the structured abstract, already in use in other fields, might help here; I have applied it in the English abstract of this work as well.

Like all studies, this Licentiate Thesis has limitations that must be taken into account when interpreting its results. The most important limitation is that errors may have occurred in the inclusion of publications and in the coding of studies, even though efforts were made to avoid and detect them. It is also possible that some relevant studies were not found by the searches and are therefore not accounted for in the mapping.

The conclusion of my mapping study is that some empirical evidence exists to support evidence-based programming language design, but only a few design decisions have been studied more extensively. Language designers may benefit from familiarizing themselves with the studies found in the mapping, especially those concerning branching, loops, static and dynamic typing, class inheritance, transactional memory, and indentation. Researchers working in the area covered by the mapping would do well to acquaint themselves with the criticism that has been levelled in the literature against earlier studies. In addition, as is customary in systematic secondary studies, I note that new primary studies are needed; the studies on branching in particular are already old and do not necessarily correspond very well to present-day conditions. For some topics it may also be useful to prepare systematic reviews.

Licentiate Thesis is now publicly available

My recently accepted Licentiate Thesis, which I posted about a couple of days ago, is now available in JyX.

Here is the abstract again for reference:

Kaijanaho, Antti-Juhani
The extent of empirical evidence that could inform evidence-based design of programming languages. A systematic mapping study.
Jyväskylä: University of Jyväskylä, 2014, 243 p.
(Jyväskylä Licentiate Theses in Computing,
ISSN 1795-9713; 18)
ISBN 978-951-39-5790-2 (nid.)
ISBN 978-951-39-5791-9 (PDF)
Finnish summary

Background: Programming language design is not usually informed by empirical studies. In other fields similar problems have inspired an evidence-based paradigm of practice. Central to it are secondary studies summarizing and consolidating the research literature. Aims: This systematic mapping study looks for empirical research that could inform evidence-based design of programming languages. Method: Manual and keyword-based searches were performed, as was a single round of snowballing. There were 2056 potentially relevant publications, of which 180 were selected for inclusion, because they reported empirical evidence on the efficacy of potential design decisions and were published on or before 2012. A thematic synthesis was created. Results: Included studies span four decades, but activity has been sparse until the last five years or so. The form of conditional statements and loops, as well as the choice between static and dynamic typing have all been studied empirically for efficacy in at least five studies each. Error proneness, programming comprehension, and human effort are the most common forms of efficacy studied. Experimenting with programmer participants is the most popular method. Conclusions: There clearly are language design decisions for which empirical evidence regarding efficacy exists; they may be of some use to language designers, and several of them may be ripe for systematic reviewing. There is concern that the lack of interest generated by studies in this topic area until the recent surge of activity may indicate serious issues in their research approach.

Keywords: programming languages, programming language design, evidence-based paradigm, efficacy, research methods, systematic mapping study, thematic synthesis

A milestone toward a doctorate

Yesterday I received my official diploma for the degree of Licentiate of Philosophy. The degree lies between a Master’s degree and a doctorate, and is not required; it consists of the coursework required for a doctorate, and a Licentiate Thesis, “in which the student demonstrates good conversance with the field of research and the capability of independently and critically applying scientific research methods” (official translation of the Government decree on university degrees 794/2004, Section 23 Paragraph 2).

The title and abstract of my Licentiate Thesis follow:

Kaijanaho, Antti-Juhani
The extent of empirical evidence that could inform evidence-based design of programming languages. A systematic mapping study.
Jyväskylä: University of Jyväskylä, 2014, 243 p.
(Jyväskylä Licentiate Theses in Computing,
ISSN 1795-9713; 18)
ISBN 978-951-39-5790-2 (nid.)
ISBN 978-951-39-5791-9 (PDF)
Finnish summary

Background: Programming language design is not usually informed by empirical studies. In other fields similar problems have inspired an evidence-based paradigm of practice. Central to it are secondary studies summarizing and consolidating the research literature. Aims: This systematic mapping study looks for empirical research that could inform evidence-based design of programming languages. Method: Manual and keyword-based searches were performed, as was a single round of snowballing. There were 2056 potentially relevant publications, of which 180 were selected for inclusion, because they reported empirical evidence on the efficacy of potential design decisions and were published on or before 2012. A thematic synthesis was created. Results: Included studies span four decades, but activity has been sparse until the last five years or so. The form of conditional statements and loops, as well as the choice between static and dynamic typing have all been studied empirically for efficacy in at least five studies each. Error proneness, programming comprehension, and human effort are the most common forms of efficacy studied. Experimenting with programmer participants is the most popular method. Conclusions: There clearly are language design decisions for which empirical evidence regarding efficacy exists; they may be of some use to language designers, and several of them may be ripe for systematic reviewing. There is concern that the lack of interest generated by studies in this topic area until the recent surge of activity may indicate serious issues in their research approach.

Keywords: programming languages, programming language design, evidence-based paradigm, efficacy, research methods, systematic mapping study, thematic synthesis

A Licentiate Thesis is assessed by two examiners, usually drawn from outside of the home university; they write (either jointly or separately) a substantiated statement about the thesis, in which they suggest a grade. The final grade is almost always the one suggested by the examiners. I was very fortunate to have such prominent scientists as Dr. Stefan Hanenberg and Prof. Stein Krogdahl as the examiners of my thesis. They recommended, and I received, the grade “very good” (4 on a scale of 1–5).

The thesis has been published in our faculty’s licentiate thesis series and has appeared in our university’s electronic database (along with a very small number of printed copies). If anyone wants an electronic preprint, send me email at antti-juhani.kaijanaho@jyu.fi.

Figure 1 of the thesis: an overview of the mapping process

As you can imagine, the last couple of months in the spring were very stressful for me, as I pressed on to submit this thesis. After submission, it took me nearly two months to recover (which certain people who emailed me on Planet Haskell business during that period certainly noticed). It represents the fruit of almost four years of work (way more than normally is taken to complete a Licentiate Thesis, but never mind that), as I designed this study in Fall 2010.

Figure 8 of the thesis: Core studies per publication year

Recently, I have been writing in my blog a series of posts in which I have been trying to clear my head about certain foundational issues that irritated me during the writing of the thesis. The thesis contains some of that, but that part of it is not very strong, as my examiners put it, for various reasons. The posts have been a deliberately non-academic attempt to shape the thoughts into words, to see what they look like fixed into a tangible form. (If you go read them, be warned: many of them are deliberately provocative, and many of them are intended as tentative in fact if not in phrasing; the series also is very incomplete at this time.)

I closed my previous post, the latest post in that series, as follows:

In fact, the whole of 20th Century philosophy of science is a big pile of failed attempts to explain science; not one explanation is fully satisfactory. […] Most scientists enjoy not pondering it, for it’s a bit like being a cartoon character: so long as you don’t look down, you can walk on air.

I wrote my Master’s Thesis (PDF) in 2002. It was about the formal method called “B”; but I took a lot of time and pages to examine the history and content of formal logic. My supervisor was, understandably, exasperated, but I did receive the highest possible grade for it (which I never have fully accepted I deserved). The main reason for that digression: I looked down, and I just had to go poke the bridge I was standing on to make sure I was not, in fact, walking on air. In the many years since, I’ve taken a lot of time to study foundations, first of mathematics, and more recently of science. It is one reason it took me about eight years to come up with a doable doctoral project (and I am still amazed that my department kept employing me; but I suppose they like my teaching, as do I). The other reason was, it took me that long to realize how to study the design of programming languages without going where everyone has gone before.

Debian people, if any are still reading, may find it interesting that I found significant use for the dctrl-tools toolset I have been writing for Debian for about fifteen years: I stored my data collection as a big pile of dctrl-format files. I ended up making some changes to the existing tools (I should upload the new version soon, I suppose), and I wrote another toolset (unfortunately one that is not general purpose, like the dctrl-tools are) in the process.
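For readers who have not seen the format, here is a minimal Python sketch of reading dctrl-style data: paragraphs separated by blank lines, each made of `Field: value` lines, with continuation lines starting with whitespace. It is not how dctrl-tools itself is implemented, and the field names and sample records are invented for illustration; the post does not say what fields the actual data collection used. Comment lines and other details of the real Debian control format are ignored here.

```python
def parse_dctrl(text):
    """Parse control-format text into a list of dicts, one per paragraph."""
    records = []
    current = {}
    last_field = None
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current paragraph.
            if current:
                records.append(current)
            current, last_field = {}, None
        elif line[0] in " \t":
            # A line starting with whitespace continues the previous field.
            if last_field is not None:
                current[last_field] += "\n" + line.strip()
        else:
            # "Field: value"; split at the first colon only.
            name, _, value = line.partition(":")
            last_field = name.strip()
            current[last_field] = value.strip()
    if current:
        records.append(current)
    return records


# Hypothetical sample data, made up for this example.
SAMPLE = """\
Id: example-1
Title: A made-up study title
Note: first line of a longer note
 continued on a second line

Id: example-2
Title: Another made-up title
"""

records = parse_dctrl(SAMPLE)
for record in records:
    print(record["Id"], "-", record["Title"])
```

Storing one paragraph per record keeps the files easy to edit by hand and to filter with line-oriented tools.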

For the Haskell people, I mainly have an apology for not attending to Planet Haskell duties in the summer; but I am back in business now. I also note, somewhat to my regret, that I found very few studies dealing with Haskell. I just checked; I mention Haskell several times in the background chapter, but it is not mentioned in the results chapter (because there were no studies worthy of special notice).

I am already working on extending this work into a doctoral thesis. I expect, and hope, to complete that one faster.

Philosophy matters

What we now know as physics and mathematics, and as many other disciplines of science, originated in philosophy and eventually split from it when the training of a physicist (or mathematician, or…) became sufficiently different from the training of a philosopher that they became essentially different traditions and skill sets. Thus, it may be said (correctly) that the legitimate domain of philosophy has shrunk considerably from the days of Socrates to the present day. Some people have claimed that it has shrunk so much as to make legitimate philosophy trivial or, at least, irrelevant. That is a gross misjudgment.

Consider science (as I have in my past couple of posts). Science generally delivers sound results, I (and a lot of other people) believe. Why does it? This is a question of philosophy; in fact, it is the central question of the philosophy of science. It is also a question that science itself cannot answer, for that would be impermissible circular reasoning (science works because science works). It is therefore a question of legitimate philosophy. It is not trivial, for once one gets past the knee-jerk reactions, which amount to “science works because it’s science”, there are no easy answers.

In fact, the whole of 20th Century philosophy of science is a big pile of failed attempts to explain science; not one explanation is fully satisfactory. Absent a common convincing philosophical grounding, there is room for the development of competing schools of thought even within a single discipline, and this, in fact, did happen (and still causes strong feelings). Fundamental disagreements about what can be known, what should be known, and how one goes about establishing knowledge are still unresolved.

Most scientists enjoy not pondering it, for it’s a bit like being a cartoon character: so long as you don’t look down, you can walk on air.

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Beware of unnecessary commitment

The most elementary and valuable statement in science, the beginning of wisdom is, “I do not know”.

It may seem strange for me to open a blog post on the philosophy of knowledge and science with a video clip and a quotation from a rather cheesy episode of Star Trek: The Next Generation (“Where Silence Has Lease”), a science fiction show not celebrated for its scientific accuracy. However, that quotation hit me like a ton of bricks when I saw that episode for the first time more than twenty years ago. It has the same kind of wisdom as the ancient pronouncement, attributed to the god in Delphi by Socrates:

Human beings, he among you is wisest who knows like Socrates that he is actually worthless with respect to wisdom.

(This quote is at 23b of Socrates’ Defense [traditionally translated under the misleading title “Apology”] by Plato, as translated by Cathal Woods and Ryan Pack.)

The great teaching of these two quotes is, in my view, that one must keep an open mind: it is folly to think, mistakenly, that one knows something, and one should always be very careful committing to a particular position.

Of course, not all commitments are of equal importance. Most commitments to a position are limited: one might commit to a position only briefly or tentatively, for the sake of the argument and for the purposes of testing that position (these recent blog posts of mine on philosophy are of just this kind), or one might commit to a position in an entrance exam, for the purpose of gaining entry to a school. Some commitments are permanent: for example, knowingly allowing surgery to remove one’s colon is a powerful and irreversible commitment, but then, so is the decision not to take the surgery if one has a diagnosed colorectal cancer (that decision may be reversible for a while, but not indefinitely).

The key thing, in my view, is to make only necessary commitments. Remember my previous post, where I argued that life is a big gamble? A commitment is necessary, in my view, if it follows from making a real-life bet with real-life consequences. For example, if one acquiesces to the removal of one’s colon as a treatment for colorectal cancer, one is betting one’s life on that decision, and thus the implied commitment to its superiority as a treatment (compared to, say, just eating healthily) is necessary. Conversely, a commitment is unnecessary if it is not connected to any real-life decision with significant stakes.

One thing that bothers me about the current paradigm of science (in all disciplines I am familiar with) is a fetish for unnecessary commitment. A researcher is expected to commit to an answer to their research question in their report, even though, most times, all they manage to do is provide evidence that will slightly alter a person’s probability assignment regarding that question. In most cases, this commitment is unnecessary, in that the researcher does not bet anything on the result (though there are significant exceptions). This fetish has the unfortunate consequence that statistical methodology is routinely misused to produce convincing-sounding justifications for such commitments. Even more unfortunate is that most studies pronounce their judgments based only on their own data, however meager, and all but ignore all other studies on the same question (technically speaking, they fail to establish the prior). Many other methodological issues arise similarly from the fetish for unnecessary commitment.
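To make the point about priors concrete, here is a small Python sketch of a single Bayesian update. All the numbers are invented for illustration: the study's data are assumed to be only modestly more likely if the hypothesis is true (0.6) than if it is false (0.4), and the function name is likewise just a label chosen for this example.

```python
def update(prior, p_data_if_true, p_data_if_false):
    """Posterior probability of a hypothesis after one piece of evidence (Bayes' rule)."""
    numerator = prior * p_data_if_true
    return numerator / (numerator + (1 - prior) * p_data_if_false)


# Hypothetical likelihoods: the data favor the hypothesis, but only weakly.
P_DATA_IF_TRUE = 0.6
P_DATA_IF_FALSE = 0.4

# The same evidence, seen by readers who start from different priors.
for prior in (0.1, 0.5, 0.9):
    posterior = update(prior, P_DATA_IF_TRUE, P_DATA_IF_FALSE)
    print(f"prior {prior:.2f} -> posterior {posterior:.2f}")
```

Under these assumed numbers, a reader who starts out doubtful remains doubtful after the study, while a reader who starts out convinced remains convinced. The evidence nudges each probability assignment; what it supports depends on where the reader started, which is why ignoring earlier studies on the same question matters.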

Of course, necessary commitments based on science occur all the time. If I step on a bridge, I am committing myself to the soundness of bridge-building science, among other things. We, humanity, have collectively already committed ourselves to the belief that global climate change is not such a big deal (otherwise, we would have been much more aggressive about dealing with it in the decades past). Every day, we commit ourselves to the belief that Newtonian and Einsteinian physics are sound enough that they correctly predict that the sun will rise tomorrow.

But it is unnecessary for me to commit to any particular theory as to why MH370 vanished without a trace, since it is only, pardon the expression, of academic interest to me.

Life is a gamble

You might die today. You might suffer a stroke, for example. Or, if you venture to the streets, you might be hit by a car. Or if you happen to be outside, you might be hit by lightning. Or a meteor might strike near enough to you. Or World War III might start with a megaton warhead exploding nearby.

You might, also, if you are single, find your true love today. Or you might be crowned the Supreme Ruler of the Known Universe. Or you might find an important person’s wallet and be able to collect a huge finder’s fee. Or you could find yourself the single winner of a jackpot in the national lottery.

One purpose of the long childhood and adolescence of humans is to allow time for one to be taught and otherwise acquire the necessary skills in the great gamble that we call life. One learns to pay attention to the important things: take care to look both ways before crossing a road, for example. One learns to avoid the really dangerous things, such as touching a hot stove burner with an unprotected hand, or poking inside a live electric socket with an iron nail. Most importantly, one learns to learn, to adapt to new situations.

One learns to emphasize taking into account dangers and opportunities that one regards as likely: for example, when crossing a road, one looks left and right, since those are the directions where one is likely to see approaching vehicles that pose a collision danger. One learns to ignore extremely unlikely dangers: when crossing a road, one does not look up to see if a helicopter is about to land on the road. One also learns to adjust these likelihood estimates based on observations: if there’s a loud noise above, looking up is warranted; there actually might be a helicopter approaching to land.

An aircraft being towed after forced landing on a highway. Picture by Vermont Agency of Transportation via Flickr

Most people would not describe making the decision to cross a road as rolling dice, but that’s effectively what it is. Even if one is extremely careful, even if one has looked both left and right, and up for good measure, when one steps on the road and starts walking across it, one exposes oneself to risk. The risk is first that something out of the ordinary happens: a speeding motorcyclist traveling 200 km/h will not be noticed by a pedestrian in a routine safe-to-cross check, and the motorcyclist will not have time to take action to avoid a collision. The risk is also that one’s vigilance might be below par: a pedestrian who always crosses this road at this time of the day, not having encountered any conflicting traffic in the years and years they have taken this road, might not look as carefully as one should. There is also the risk of the extremely unlikely case of an airplane with a total engine failure making a silent forced landing on that road just as one is crossing it (bad news for both the plane and the pedestrian).

The gamble aspect is clearer (but sometimes misunderstood) when a doctor offers a patient the choice between an elective surgery and continuing with noninvasive treatment. Any surgical operation has the risk of death on the operating table; for most elective operations, where the surgeon can screen out high-risk patients, the risk is low, but patients do die in elective surgery all the time. An operation also has the risk of other adverse events, ranging from later death due to, for example, a massive pulmonary embolism, to less drastic ones. Any patient realizes this, and weighs whether the rewards justify the risks. What people sometimes miss is that one is not adding risk but trading risk, for the noninvasive option also carries risk (in some cases even the risk of sudden death at about the time the surgery would have taken place). However one looks at it, the key point is that one is rolling dice.

There is a philosophical theory that looks at life in this way. The practical knowledge each person carries in their head is modeled as a large table containing an entry for every eventuality that the person can conceive, listing for each the probability of it occurring at this instant, as that person reckons the probability. Each person’s table is different, reflecting differences in life experiences and personality. Each time one makes a new observation about the current situation or takes a decision that changes the situation, the table is instantly updated to match their personal probability for each eventuality in light of the new information or change of circumstances.

The theory requires that the probabilities in the table follow the laws of the formal theory of probability. The theory also says that a person is rational if their table of probabilities never leads them to place a set of bets that is a sure loser; for example, betting for the equation 2+2=5 (assuming conventional meanings for the symbols in the equation) is a sure loser, while betting for Elvis being alive is not (it’s just very, very unlikely). In technical terminology, a set of bets that will surely lose is a Dutch book (I do not know the etymology of that phrase); the theory thus states that rationality means not being vulnerable to a Dutch book.
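To make the Dutch book idea concrete, here is a minimal Python sketch. It rests on an assumption of my own for illustration: that an agent will pay a price equal to their stated probability for a ticket that pays one unit if the event occurs. The function name and the numbers are made up.

```python
def net_payoffs(prices, stake=1.0):
    """Net result for an agent who buys one ticket on each of several
    mutually exclusive and exhaustive outcomes.

    prices[i] is the agent's probability for outcome i, which we assume
    (for illustration only) is also the price they accept for a ticket
    paying `stake` if outcome i occurs.
    Returns the agent's net payoff in each possible outcome.
    """
    total_cost = sum(prices) * stake
    # Exactly one outcome occurs, so exactly one ticket pays out.
    return [stake - total_cost for _ in prices]


# Incoherent agent: P(rain) = 0.7 and P(no rain) = 0.5, which sum to 1.2.
incoherent = net_payoffs([0.7, 0.5])

# Coherent agent: the probabilities sum to 1.
coherent = net_payoffs([0.7, 0.3])

print(incoherent)  # negative in every outcome: a Dutch book
print(coherent)    # zero (up to rounding) in every outcome
```

The incoherent agent loses whether or not it rains, which is exactly the sure-loser set of bets described above; the coherent agent merely breaks even, and can still lose an individual bet without being irrational.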

This theory is, for historical reasons, called Bayesianism (though that term also encompasses other closely related theories); some authors use the more descriptive name of subjective probability. There are three key ideas: first, a probability is always defined in relation to some agent (a person, a computer program, etc), whose history shapes the probability; second, an agent learns by adjusting its probability estimates based on new data; and third, an agent’s actions can be viewed as bets.

Consider any particular moment when an agent receives a new piece of information. The probability the agent has assigned to a particular eventuality just before it receives that information is called its prior probability or just its prior for that eventuality. Conversely, the probability the agent assigns to an eventuality in response to the new information is called its posterior probability or its posterior for that eventuality. There are a number of proposed rules for deriving the posterior from the prior and the nature of the new information; the oldest and best established is based on Thomas Bayes’s Theorem of formal probability theory, which explains the name Bayesianism.
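As a rough sketch of the Bayes's Theorem rule for a single yes-or-no eventuality, the following Python function computes a posterior from a prior. The function name, parameter names and all the numbers are my own illustrative assumptions; the scenario reuses the helicopter example from earlier.

```python
def bayes_update(prior, likelihood_if_true, likelihood_if_false):
    """Posterior probability of a hypothesis H after one observation.

    prior:               P(H) before the observation
    likelihood_if_true:  P(observation | H)
    likelihood_if_false: P(observation | not H)
    """
    # Total probability of making the observation at all.
    evidence = likelihood_if_true * prior + likelihood_if_false * (1.0 - prior)
    if evidence == 0:
        raise ValueError("observation has zero probability under this prior")
    return likelihood_if_true * prior / evidence


# H: "a helicopter is about to land on this road". Made-up numbers.
prior = 0.001
# Observation: a loud noise from above, likely if H holds and rare otherwise.
posterior = bayes_update(prior, likelihood_if_true=0.9, likelihood_if_false=0.01)

print(prior, posterior)
```

With these invented numbers the posterior is many times larger than the prior yet still small, which matches the intuition that a loud noise makes looking up worthwhile without making a helicopter certain.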

A number of philosophers and scientists of the early 20th Century found the inherent subjectivity of Bayesianism repugnant, and developed several alternative theories, common to all being the idea of the probability of an eventuality being an objective trait of the eventuality, not tied to any particular agent as Bayesianism decrees. Standard statistical inference, as it is taught in most courses of statistics and employed in most statistical studies over the 20th Century and even in this century, is the best known of these alternative theories. All approaches share the same formal theory of probability, but the way they apply it to real-world situations is different. While standard statistics is defensible in its own right, it has been mistaught and misapplied by scientists for nearly a century… but that’s a whole other blog post.

Nevertheless, I find the informal version of Bayesianism that I have described to be a very good rule-of-thumb model of many things. Consider, for example, the question of truth I discussed in my two previous posts (1, 2). Two witnesses may have a totally different view on whether the defendant handled a knife at the scene; this can be viewed as the result of ambiguous information (what each witness actually saw) combining with their personal priors to yield their respective posteriors, which they then offer to the court (no pun intended!). It also neatly explains why people come to different conclusions on such things as the existence of God, the efficacy of homeopathy, and the danger from global climate change; different people have different priors (which is another way of saying that they have different prejudices), and the evidence they have seen is, for many people, not sufficiently impressive to make an appreciable difference to their posteriors. As the saying goes, extraordinary claims require extraordinary evidence… but each person has their own idea of what is extraordinary.

There is, however, some hope. A theorem has been proved saying, in effect, that no matter how divergent the viewpoints of people on some point, sufficient evidence can be imagined to make them all agree. Whether that imagined evidence can, in practice, be found, is another matter.
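The following Python sketch is a toy illustration of that kind of convergence, not the theorem itself. It assumes two agents who disagree strongly about whether a coin is biased towards heads, but who agree on what each hypothesis predicts and who see the same flips. The hypotheses, the probabilities and the record of flips are all invented for the example.

```python
def update(prior, observation, p_heads_if_biased=0.8, p_heads_if_fair=0.5):
    """One Bayes-rule update of P(coin is biased) after a single flip."""
    if observation == "H":
        like_biased, like_fair = p_heads_if_biased, p_heads_if_fair
    else:
        like_biased, like_fair = 1 - p_heads_if_biased, 1 - p_heads_if_fair
    evidence = like_biased * prior + like_fair * (1 - prior)
    return like_biased * prior / evidence


# A fixed, made-up record of 100 flips: 80 heads and 20 tails.
flips = list("HHHHT" * 20)

sceptic, believer = 0.01, 0.99  # very different priors for "biased"
initial_gap = abs(believer - sceptic)

for flip in flips:
    sceptic = update(sceptic, flip)
    believer = update(believer, flip)

final_gap = abs(believer - sceptic)
print(f"gap before: {initial_gap:.2f}, gap after: {final_gap:.2e}")
```

Note what the sketch takes for granted: neither prior is exactly 0 or 1, and both agents interpret the evidence in the same way. An agent whose prior is exactly 0 would never move, however many flips they saw.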

There is another point to consider, as well. There are eventualities that are fatal. If one’s prior for even one of those eventualities is way too low, one will eventually be killed. On the Internet, we call that earning the Darwin Award; it is Nature’s way to manage our collective priors.

The many faces of truth

In my previous post, I discussed why I think there is no such thing as “the” truth, a single truth valid for all intents and purposes. Truth is much more complicated.

Consider the oath administered to a witness at trial. It binds them to “tell the truth, the whole truth, and nothing but the truth” (exact phrasing varies from jurisdiction to jurisdiction), but nobody seriously expects them to tell what actually took place, in a correspondence-theory sense. For one thing, a witness is, of course, restricted by the fact that they are but one person at the scene, and can observe the event from only one place at any one time. Further, a human being is not (with few exceptions) a mere recording machine, able to relay the exact image they saw or the exact sounds they heard; instead, they interpret their observations and store those interpretations (instead of the observations) in their memory, and then relay those interpretations (likely with memory-induced corruption) in speech. Finally, human memory is fallible, and it becomes more unreliable as time moves on, away from the event itself.

Let us imagine a court conducting a criminal trial. This one is a Finnish court, where a witness is always directed to tell what they know of the event in their own words; questions from the lawyers, if any, come later.

THE FIRST WITNESS is a 19-year-old woman, active in the leftist environmentalist movement. Her examination goes as follows:

PROSECUTOR: Are you aware of what event is at issue in this trial?

WITNESS: Yes.

PROSECUTOR: Please recount in your own words what happened. How did you come to be at the scene, what events did you observe?

WITNESS: Well… We held an anti-nuclear demonstration at the construction site of the new nuclear plant. It was a peaceful event, of course. We carried signs and sang songs. Some troublemakers I had never seen before started throwing stones and perhaps some other things toward the construction site. The police patrol who had been standing nearby ordered us all to leave. My friend Petri went to talk to them, quite peacefully, and suddenly the police started beating him, threw him to the ground, handcuffed him and dragged him to the police van. They did all that to him, without any provocation! All the while the troublemakers continued their activities. That’s all, I think.

PROSECUTOR: By Petri, you mean Petri Suuro, the accused?

WITNESS: Yes.

PROSECUTOR: Did you see what Petri was doing before the police came to him?

WITNESS: I’m sure he did not do anything justifying their brutality.

PROSECUTOR: Nothing further for the main examination, President.

PRESIDING JUDGE: Anything on behalf of Mr. Suuro? (Advocate shakes head.) All right, any cross examination? Prosecutor?

PROSECUTOR: Thank you, President. (To the witness:) Where was your attention directed just prior to the police coming to Petri?

WITNESS: I was looking at everything, I believe.

PROSECUTOR: So you saw what the troublemakers were doing, and also what Petri was doing?

WITNESS: Yes, absolutely.

PROSECUTOR: Was Petri standing still, or walking? Did he have anything in his hand, perhaps?

WITNESS: He was standing still, but shouting to the troublemakers. I think his hands were in his pockets; that’s what he usually does with his hands.

PROSECUTOR: Do you know, did Petri carry a knife in his pocket?

WITNESS: Absolutely not. He’s a peaceful guy; why would he carry a knife to a demonstration?

PROSECUTOR: You did not see him take a knife out of his pocket?

WITNESS: What knife? He did not have a knife to take out. No, I never saw any knife in his possession.

PROSECUTOR: Nothing further.

PRESIDING JUDGE: On behalf of Mr. Suuro? All right, let’s end the examination.

THE SECOND WITNESS is a 60-year-old man, in charge of the nuclear plant construction site.

PROSECUTOR: You are aware of what events we are discussing today?

WITNESS: Yes, I am.

PROSECUTOR: Please tell us what you know of the events leading to the police arresting Mr. Suuro, sitting there. How did you come to be there at the time, and what did you observe?

WITNESS: Very well. I am a senior manager of the company building the new nuclear plant, and I am in charge of the construction site. We knew, of course, of the planned demonstration; it was fairly well publicized by the demonstrators. We also received information we believed to be somewhat reliable suggesting that the demonstrators intended to set fire to the construction site. I contacted the police and made sure they were present in force. I also attended myself, just in case.

The demonstration started peacefully enough, but soon the demonstrators started throwing rocks and Molotov cocktails through the outer fence. Some did in fact manage to get fires started, and it was difficult for our fire crews to do their job due to the flying rocks. I went to the police to demand action, but they were faster, and were already taking action. My attention was grabbed by shouting near me; it was a young man, I believe Mr. Suuro here, having a loud argument with two police officers. I noticed particularly that Mr. Suuro started to take a knife from his pocket, at which point the officers quite properly restrained him and put him into one of the police vans.

PROSECUTOR: Did you observe the behaviour of Mr. Suuro before the argument with the police?

WITNESS: For me, he was just one of many, and I did not pay him any attention before the argument. I was not even particularly aware of him.

PROSECUTOR: Nothing further.

PRESIDING JUDGE: On behalf of Mr. Suuro?

MR. SUURO’S ATTORNEY: Cross examination only, President.

PRESIDING JUDGE: All right, cross examination, then. Prosecutor?

PROSECUTOR: Nothing, President.

PRESIDING JUDGE: On behalf of Mr. Suuro?

MR. SUURO’S ATTORNEY: Thank you, President. (To the witness:) Did you see the knife well?

WITNESS: I saw it was a knife.

MR. SUURO’S ATTORNEY: What sort of a knife?

WITNESS: A normal kind. Shiny.

MR. SUURO’S ATTORNEY: A butter knife? A puukko? A bread knife?

WITNESS: An average knife. I can’t remember what kind exactly.

MR. SUURO’S ATTORNEY: What was your attention focused on when you saw the knife?

WITNESS: I was looking at Mr. Suuro and the police officers.

MR. SUURO’S ATTORNEY: Not the knife specifically?

WITNESS: I noticed the knife, particularly how it shone.

MR. SUURO’S ATTORNEY: But you were not looking at Mr. Suuro’s hands or pockets when you noticed the knife?

WITNESS: Not specifically. I was looking at the three people involved. But I did see the knife.

MR. SUURO’S ATTORNEY: Apart from the knife, did Mr. Suuro behave aggressively toward the police officers?

WITNESS: He was clearly agitated. Used a loud, angry voice.

MR. SUURO’S ATTORNEY: Just words, no action?

WITNESS: The knife was action. But otherwise no.

MR. SUURO’S ATTORNEY: Do you remember what Mr. Suuro was saying to the officers?

WITNESS: He was, I think, telling them to stop pestering him and go after the real troublemakers.

MR. SUURO’S ATTORNEY: Was Mr. Suuro one of the troublemakers in your estimate?

WITNESS: He made trouble to the officers, and he was one of the demonstrators.

MR. SUURO’S ATTORNEY: Other than the altercation with the police, and his presence at the scene, was he a troublemaker in your estimate? Did you see him throw stones or Molotov cocktails, for example?

WITNESS: Not when he was arguing with the officers. Before that I had not paid him any attention.

MR. SUURO’S ATTORNEY: And afterwards he was under arrest. Nothing further, President.

PRESIDING JUDGE: Prosecutor? (Prosecutor shakes head.) All right, this examination is closed.

Mr. Suuro is, in this fictional court case, charged with several offences. Relevant to these two witness examinations are that he allegedly carried a knife illegally in public, and that he attempted an assault of police officers with that knife. Since these are fictional examinations, and I created these two witnesses myself, I can authoritatively say that neither lied. Both complied with the oath they had taken (the local phrasing goes “[…] I shall testify and state the whole truth in this case, without concealing it, adding to it or altering it”). Yet, they tell a completely different story.

The first witness was acquainted with Mr. Suuro and thus had a measure of his character. From her point of view, what happened was an unprovoked attack on a peaceful protestor by the police. This is, of course, possible. The second witness reports having seen the knife, but it is possible that he mistook something else for it, or simply imagined it, being indisposed to believe that the police would take such action without provocation.

The second witness was predisposed to think ill of Mr. Suuro; not out of personal malice but simply because Mr. Suuro and he were on different sides of a confrontation, and held deeply divergent political views. His account is also plausible; the first witness may not know Mr. Suuro as well as she thinks, and she may not have paid enough attention to him during the demonstration.

It is notable that the first witness does not profess to have seen the absence of a knife; she merely states she knows it was not there. She could be right, of course; she might actually have seen that there was nothing in the pocket, but she only remembers a deep conviction, not the details. She could be wrong, as well; she may have been positioned so that she did not see that pocket and the hand where the knife was, and her strong belief in the absence of the knife, and the haziness of memory by the time of the trial, might make her think she saw its absence.

Now, I deliberately wrote these vignettes to be ambiguous; even I do not know who is right. I would have to decide, were I to write a complete story about these fictional events; but for the purposes of this blog post, we do not know.

A court, of course, is obligated to decide the case. Given these two testimonies and no other evidence, it has no choice but to choose who is more believable. If more evidence is presented, it of course may become easier to decide the case. In either case, however, the court does not decide what actually happened. What it decides is whether using the power of the state against Mr. Suuro is warranted given the evidence it has heard. The English translation of the Finnish Code of Judicial Procedure (Chapter 17 Section 2 Paragraph 1) puts it this way:

After having carefully evaluated all the facts that have been presented, the court shall decide what is to be regarded as the truth in the case.

Note that the court is not tasked to determine the truth but what is to be regarded as the truth. What actually took place may be something different. For example, in a criminal case, the standard of reasonable doubt applies: everybody may be convinced that the accused did it (and it even may be that the accused really did do it), but if the court cannot justify to itself why a minority theory must be disregarded as unreasonable, then what is to be regarded as the truth is that the accused did not do it.

We have here already two faces of truth: the truth required of a witness, and the truth required of a court of law. A witness’s truth is that of honesty, not of correspondence to reality. The truth of a court of law is that of justice and of the moderation of the power of the state; again, not of correspondence to reality.

There is at least a third face of truth. When a medical scientist says that a particular treatment is efficacious and safe for the treatment of a particular class of patients, they are predicting the future. The scientist’s truth is a best guess derived from studies already performed, and any competent scientist will acknowledge that they might be wrong; there certainly are lots of historical examples of a scientist’s well-backed prediction not corresponding to reality. Sometimes it is misconduct (but then it’s not well-backed); sometimes it is the application of research methods thought to be correct but later revealed to be faulty; and sometimes it is simply that our research methods, while still considered valid, are not 100% reliable. But in many, many cases, the prediction appears to be correct.

The scientist’s truth is, I think, the most complex. It is partly the same as the witness’s: it is truth of honesty. But it is also something more: the truth of competence, and the truth of expertise. And, like all faces of truth, sometimes it turns out not to correspond with the reality.

I think the best way to see truth is in this way: not as correspondence to reality but as an obligation of a human being. Truth is about honesty, justice, moderation of power, competence, and expertise, mixed as appropriate.

The Truth – what a load of nonsense!

I have been interested in science since before I can remember. I was reading popular science, I believe, before first grade. I was reading some undergraduate textbooks (and the occasional research monograph – not that I understood them) before high school. I did this in order to learn the truth about what the world is. Eventually I learned enough to realize that I wanted to really understand General Relativity. For that, I learned from a book I no longer recall, I needed to learn tensors.

Albert Einstein, Etching by Ferdinand Schmutzer 1921, via Wikimedia Commons

I, therefore, set myself on the project of building my mathematical knowledge base sufficiently far to figure tensors out. By the time I started high school (or rather, its equivalent), I had switched from desiring mathematics as a tool (for understanding Relativity) to taking it as a goal in itself. My goal for the next few years was to learn high school math well enough to be able to study mathematics at university as a major subject. I succeeded.

During the high-school years I also continued reading about physics. My final-year project in my high-school equivalent was an essay (sourced from popular science books) on the history of physics, from the ancients to the early decades of the 20th Century, enough to introduce relativity and quantum mechanics, both at a very vague level (the essay is available on the web in Finnish). I am ashamed to note that I also included a brief section on a “theory” which I was too young and inexperienced then to recognize as total bunk.

At university, I started as a math major, and began also to study physics as a minor subject. Neither lasted, but the latter hit the wall almost immediately. It was by no means the only reason why I quit physics early, but it makes a good story, and it was a major contributor: I was disillusioned. Remember, I was into physics so that I could learn what the world is made of. The first weeks of freshman physics included a short course on the theory of measurement. The teacher gave us the following advice: if your measurements disagree with the theory, discard the measurements. What the hell?

I had understood from my popular-science and undergraduate textbook readings over many years that there is a Scientific Method that delivers truthful results. In broad outline, it starts from a scientist coming up, somehow (we are not told how), with a hypothesis. They would then construct an experiment designed to test this hypothesis. If it fails, the hypothesis fails. If it succeeds, further tests are conducted (also by other scientists). Eventually, a successfully tested hypothesis is inducted into the hall of fame and gets the title of a theory, which means it’s authoritatively considered the truth. I expected to be taught this method, only in further detail and with perhaps some corrections. I did not expect to be told something that directly contradicts the method’s central teaching: if the data and the theory disagree, the theory fails.

Now, of course, my teacher was teaching freshman students to survive their laboratory exercises. What he, instead, taught me was that physics is not a noble science in pursuit of the truth. I have, fortunately, not held fast to that teaching. The lasting lesson I have taken from him, though I’m sure he did not intend it, is that the scientific method is not, in fact, the way science operates. This lesson is, I later learned, well backed by powerful arguments offered by philosophers of science. (I may come back to those philosophical arguments in other posts, but for now, I’ll let them be.)

There are, of course, other sources for my freshman-year disillusionment. Over my teen years I had participated in many discussions over USENET (a very early online discussion network, now mostly obsolete thanks to various web forums and social media). Many of the topics I participated in were concerned with the question of truth, particularly whether God or other spiritual beings and realms exist. I very dearly wanted to learn a definite answer I could accept as the truth; I never did. A very common argument used against religious beliefs was Occam’s razor: the idea that if something can be explained without some explanans (such as the existence of God), it should be so explained. Taken as a recipe for reasoning to “the truth”, it seems, however, to be lacking. Simpler explanations are more useful to the engineer, sure, but what a priori grounds can there possibly be for holding that explanatorily unnecessary things do not exist? For surely we can imagine a being that has the power to affect our lives but chooses to not wield it (at least in any way we cannot explain away by other means), and if we can imagine one, what falsifies the theory that one isn’t there?

Many scientists respond to this argument by invoking Karl Popper and his doctrine of falsification. Popper said, if I recall correctly (and I cannot right now be bothered to check), that the line separating a scientific theory from a nonscientific one is that the former can be submitted to an experiment which in principle could show the theory false, and for the latter there is no such experiment that could be imagined; in a word, a scientific theory is, by Popper’s definition, falsifiable. Certainly, my idea of a powerful being who chooses to be invisible is not falsifiable. While there are noteworthy philosophical arguments against Popper’s doctrine, I will not discuss them now. I will merely note that the main point of falsificationism is to say that my question is inappropriate; and therefore, from my 19-year-old perspective, falsificationism itself fails to deliver.

My conclusion at that young age was that science does not care about The Truth; it does not answer the question what is. It seemed to me, instead, that science seeks to answer what works: it seeks to uncover ways we can manipulate the world to satisfy our wants and needs.

My current, more mature conclusion is similar to the falsificationist’s, though not identical. The trouble is with the concept of truth. What is, in fact, truth?

In my teens, I did not have an articulated theory of truth; I took it as granted. I think what I meant by the term is roughly what is called the correspondence theory of truth by philosophers. It has two components. First, that there is a single universe, common to all sensing and thinking beings, that does not depend on anyone’s beliefs. Second, that a theory is true if for each thing the theory posits there is a corresponding thing in the universe possessing all the characteristics that the theory implies such a thing would have and not possessing any characteristics that the theory implies the thing would not have; and if the theory implies the nonexistence of some thing, there is no such thing in the universe. For example, if my theory states that there is a Sun, which shines upon the Earth in such a way that we see its light as a small circle in the sky, it is true if there actually is such a Sun having such a relationship with Earth.

Not quite my chair, but similar. Photo by Branden Baunach CC-BY 2.0, via Flickr

Unfortunately, the correspondence theory must be abandoned. Even if one concedes the first point, the existence of objective reality, the second point proves too difficult. How can I decide the (correspondence-theory) truth of the simple theory “there is a chair that I sit upon as I write this”, a statement I expect any competent theory of truth to evaluate as true? Under the correspondence theory of truth, my theory says (among other things) that there is some single thing having chair-ness and located directly under me. For simplicity, I will assume arguendo that there are eight pieces of wood: four roughly cylindrical pieces placed upright; three placed horizontally between some of the upright ones (physically connected to prevent movement); and one flat horizontal piece placed upon the upright ones, physically connected to them to prevent movement, and located directly below my bottom. I have to assume these things, and cannot take them as established facts, because this same argument I am making applies to them as well, recursively. Now, given the existence and mutual relationships of these eight pieces of wood, how can I tell that there is a real thing they make up that has the chair-ness property, instead of the eight pieces merely cooperating but not making a real whole? This question is essentially, does there exist a whole, separate from its parts but consisting of them? Is there something more to this chair than just the wood it consists of? The fatal blow to the correspondence theory is that this question is empirically unanswerable (so long as we do not develop a way to talk to the chair and ask it point blank whether it has a self).

Scientists do not, I think, generally accept the correspondence theory. A common argument a scientist makes is that a theory is just a model: it does not try to grasp the reality in its entirety. To take a concrete example, most physicists are, I believe, happy to accept Newtonian physics so long as the phenomena under study satisfy certain preconditions so that Newtonian and modern physics do not disagree too much. Yet, it is logically impossible for both the theory of Special Relativity and Newtonian physics to describe, in a correspondence-theory sense, the same reality: if the theory of Special Relativity is correspondence-theory true, then Newtonian physics cannot be; and vice versa.

If not correspondence theory, then what? Philosophers of science have come up with a lot of answers, but there does not seem to be a consensus. The situation is bad enough that in the behavioural sciences there are competing schools that start with mutually incompatible answers to the question of what “truth” actually means, and end up with wholly different ways of doing research. I hope to write in the future other blog posts looking at the various arguments and counterarguments.

For now, it is enough to say that it is naïve to assume that there is a “the” truth. Make no mistake – that does not mean that truth is illusory, or that anybody can claim anything as true and not be wrong. We can at least say that logical contradictions are not true; we may discover other categories of falsehoods, as well. The concept of “truth” is, however, more complicated than it first seems.

And, as in physics, we may never be able to develop a unified theory of truth. Perhaps, all we can do is a patchwork, a set of theories of truth, each good for some things and not for others.

Going back to my first year or two at university, I found mathematics soothing, from this perspective. Mathematics, as it was taught to me, eschewed any pretense of correspondence truth; instead, truth in math was, I was taught, entirely based on the internal coherence of the mathematical theory. A theorem was true if it could be proven (using a semi-formal approach common to working mathematicians, which I later learned to be very informal compared to formal logic); sometimes we could prove one true, sometimes we could prove one false (which is to say, its logical negation was true), and sometimes we were not able to do either (which just meant that we don’t know – I could live with that).

I told my favourite math teacher of my issue with physics. He predicted I would have the same reaction in his about-to-start course on logic. I attended that course. I learned, among other things, Alfred Tarski’s definition of truth. It is a linguistic notion of truth, and depends on there being two languages: a metalanguage, in which the study itself is conducted and which is assumed to be known (and unproblematic); and an object language, the language under study and the language whose properties one is interested in. Tarski’s definition of truth is (and I simplify a bit here) to say that a phrase in the object language is assigned meaning based on its proffered translation. For example, if Finnish were the object language and English the metalanguage, the Tarskian definition of truth would contain the following sub-definition: “A ja B” is true in Finnish if and only if C is the translation of A, D is the translation of B, and “C and D” is true in English.
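To make that sub-definition concrete, here is a minimal sketch in Python, with Python standing in for the metalanguage and a toy fragment of Finnish as the object language. The names `Atom`, `Ja`, `TRANSLATION`, `translate` and `is_true`, as well as the sample sentences and their truth values, are my own illustrative assumptions and not part of Tarski's definition.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Atom:
    """An atomic sentence of the toy object language (Finnish)."""
    text: str


@dataclass(frozen=True)
class Ja:
    """The object-language conjunction "A ja B"."""
    left: "Sentence"
    right: "Sentence"


Sentence = Union[Atom, Ja]

# Assumed translations of atomic sentences into the metalanguage, paired
# with the truth value the (already understood) metalanguage gives them.
TRANSLATION = {
    "lumi on valkoista": ("snow is white", True),
    "ruoho on punaista": ("grass is red", False),
}


def translate(s: Sentence) -> str:
    """Translate an object-language sentence into the metalanguage."""
    if isinstance(s, Atom):
        return TRANSLATION[s.text][0]
    return f"{translate(s.left)} and {translate(s.right)}"


def is_true(s: Sentence) -> bool:
    """Truth in the object language, defined by recursion on the sentence."""
    if isinstance(s, Atom):
        return TRANSLATION[s.text][1]
    # "A ja B" is true iff "C and D" is true, where C and D translate A and B.
    return is_true(s.left) and is_true(s.right)
```

Note that the clause for `Ja` does nothing but hand the question over to Python's own `and`; all the actual knowledge sits in the metalanguage.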

The Tarskian definition struck me initially as problematic. If you look up “ja” in a Finnish–English dictionary, you’ll find it translated as “and”. It now becomes obvious that Tarski’s definition does not add anything to our understanding of Finnish. And, indeed, it is one more link in the chain that says that mathematics is not concerned with correspondence truth. We cannot learn anything about the real world from studying mathematics. But I knew that already, and thus, in the end, Tarskian truth did not shatter my interest in mathematics.

I also learned in that course of Kurt Gödel’s famous incompleteness theorem. It states (and I simplify a lot) that a formal theory that is sufficiently powerful to express all of mathematics cannot prove its own coherence. This was the result my teacher was alluding to earlier, but it did not bother me. I had been taught from the beginning to regard mathematics as an abstract exercise in nonsense, valuable only for its beauty and its ability to satisfy mathematicians’ intellectual lusts. What do I care that mathematics cannot prove itself sane?

Georg Cantor circa 1870. Photographer unknown. Via Wikimedia Commons

What I did not then know is the history. You see, up until the late 19th Century, I believe mathematicians to have adhered to a correspondence theory of truth. That is, mathematics was, for them, a way to discover truths about the universe. For example, the number two can be seen as corresponding to the collection of all pairs that exist in the universe. This is why certain numbers, once they had been discovered, were labeled as “imaginary”; the mathematicians who first studied them could not come up with a corresponding thing in the universe for such a number. The numbers were imaginary, not really there, used only because they were convenient intermediate results in some calculations that end up with a real (existing!) number. This is also, I believe, one of the reasons why Georg Cantor’s late 19th Century set theory, which famously proved that infinities come in different sizes, was such a shock. How does one imagine the universe to contain such infinities?

But more devastating were the paradoxes. One consequence of Cantor’s work was that infinities can be compared for size; also, that we can design a new numbering system (called cardinal numbers by set theorists) to describe the various sizes of infinity, such that every size of infinity has a unique cardinal number. Each cardinal number itself was infinite, of the size of that cardinal number. It stands to reason that the collection of all cardinal numbers is itself infinite, and since it contains all cardinal numbers (each being its own size of infinite), it is of a size of infinity that is greater than all other sizes of infinity. Hence, the cardinal number of all infinities is the greatest such number that can exist. But it can be proven that there is no such thing; every cardinal number has cardinal numbers that are greater than it. If one were to imagine that Cantor’s theory of the infinities does describe the reality, that would imply that the universe itself is paradoxical. This Cantor’s paradox isn’t the only one; there are many others discovered about the same time. Something here is not right.
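The claim that every cardinal number has greater ones rests on Cantor's theorem. A brief sketch of the standard diagonal argument follows; the symbols X, f, D, d, κ and μ are my own notation, not taken from the text above.

```latex
\begin{align*}
&\text{Let } f : X \to \mathcal{P}(X) \text{ be any map, and put }
  D = \{\, x \in X : x \notin f(x) \,\}. \\
&\text{If } D = f(d) \text{ for some } d \in X, \text{ then }
  d \in D \iff d \notin f(d) = D, \text{ a contradiction.} \\
&\text{So no } f \text{ is onto, and } |X| < |\mathcal{P}(X)|,
  \text{ that is, } \kappa < 2^{\kappa} \text{ for every cardinal } \kappa. \\
&\text{A greatest cardinal } \mu \text{ would therefore satisfy }
  \mu < 2^{\mu} \le \mu, \text{ which is impossible.}
\end{align*}
```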

A new branch of mathematics emerged from this, termed metamathematics, whose intent was to study mathematics itself mathematically. The idea was that finite stuff is well understood, and since it corresponds to the reality, we can be sure it is free of paradoxes. Metamathematicians aimed to rebuild the mathematics of the infinite from the ground up, using only finite means, to see what of the infinity actually is true and what is an artefact of misusing mathematical tools due to poor understanding of them. This work culminated in two key discoveries of the 1930s: Kurt Gödel’s incompleteness theorem, which basically said that metamathematics cannot vindicate mathematics, and Alan Turing’s result that said that mathematics cannot be automated. Of course, the technique Turing used is his famous Machine, which is one of the great theoretical foundations of computer science.

Fast forward sixty years, to the years when I studied mathematics in university. The people who taught me were taught by the people who had been taught by the people who were subjected to Gödel and Turing’s shocks. By the time I came into the scene, the shock had been absorbed. I was taught to regard mathematics as an intellectual game, of no relevance to reality.

I eventually switched to software engineering, but I always found my greatest interest to be at the intersection of theoretical computer science and software engineering, namely the languages that people use to program computers. In theoretical computer science, the tools are of mathematics, but they are made relevant to reality because we have built concrete things that are mathematics. Mathematics is not relevant to reality because it describes reality, but because it has become reality! And since the abstract work of the computers derives its meaning, Tarski-like, from mathematics, we have no problem with a correspondence theory. Truth is, again, uncomplicated, and it was, for me, for many years.

Until I realized that computers are used by people. In the last couple of years I have been reading up on behavioural research, as it is relevant to my own interest in language design. Again, the question of truth, and especially how we can learn it, becomes muddled.

Forgive me, gentle reader. This blog post has no satisfactory ending. It is partly because there is no completely satisfactory answer that I am aware of. I will be writing more in later posts, I hope; but I cannot promise further clarity to you. What I can promise is to try to show why the issue is so problematic, and perhaps I can also demonstrate why my preferred answer (which I did not discuss in this post) is comfortable, even if not quite satisfactory.

(Please note that this is a blog post, not a scholarly article. It is poorly sourced, I know. I also believe that, apart from recounting my personal experiences, it is not particularly original. Please also note that this is not a good source to cite in school papers, except if the paper is for some strange reason about me.)

On the handling of the political programme

I had just raised the red slip, which meant a no vote, into the air and started looking around me, as was my habit. Almost behind me sat another delegate who was still pondering their vote. I had argued hard with them the previous evening about the very matter now being voted on, and I knew their position. I said: “Listen, red is feminism, green is equality.” And so a green vote went up. After the vote, one delegate still thought the matter was open, and I had to say: “Sorry, that one already went.”

Last weekend the Greens’ party congress approved the party’s political programme for the next parliamentary term. Such a programme is approved every four years before the parliamentary elections; a similar process will next be gone through in two years, when the programme of principles, which is in force for four years at a time, is renewed. Traditionally the process has started on the first congress day, Saturday, when an enormous number of amendment proposals to the base text are submitted and speeches are given for hours (and listened to with half an ear). The proposals are sifted by the programme working group, which convenes on Saturday evening and often works late into the small hours. Most of the second congress day, Sunday, then goes to the final approval of the programme. This time a three-day congress was tried, so that the proposals were submitted on Friday and the working group worked all of Saturday.

Sunday morning at the party congress. At the nearest table, from right to left, Riitta Lätti, Antti-Juhani Kaijanaho and Ville Korhonen, and behind us Marja Kupari and, with her back to the camera, Irene Hallamäki. Photo: Virpi Kauko

Sunday’s handling is the key to the whole thing. Previously the new base text prepared by the working group came out on Sunday morning; this time it was published late on Saturday evening. Neither option gives delegates much time to study properly the dozens of pages on which those countless amendment proposals are recorded. My own strategy has been to ignore the proposals the working group rejected and to focus only on the ones it accepted and the ones it brings to the main hall for a vote. In addition, delegates usually gather on Sunday morning to discuss the proposals in small groups; in my own group of delegates from Central Finland, we managed during an hour-long meeting to discuss briefly all the proposals made by Central Finland delegates.

The handling itself in the main hall proceeds on Sunday as follows. The table of amendment proposals compiled by the working group is gone through page by page. The working group’s proposed decision, if there is one, is the base proposal; for some amendment proposals the working group has not recorded a proposed decision, and the matter is brought to the main hall directly for a vote. When a page comes up for handling, any delegate with voting rights may additionally demand a vote on a particular proposal. Otherwise only points of order are allowed; no discussion of the content of the proposals is permitted, and no new proposals are accepted. Those working group proposals that are not voted on are approved as they stand.

The task of the chair of the meeting is very difficult here. First of all, the chair must recognise which of the proposals coming to a vote are independent of each other and which of them must be handled as one matter. For the ones handled together, the chair must draw up a clear and workable voting order. As a rule, the detailed parliamentary voting order has been used, in which the two proposals that differ most from the base are first set against each other, and then the winner is set against the proposal that differs next most from the base, until only one proposal remains. Other voting orders may also be used, as long as the eventual winner has the support of the majority.
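The detailed parliamentary voting order can be sketched in a few lines of Python. This is only my own illustration under loud assumptions: each delegate is modelled as a fixed preference ranking, the base text is placed last in the order, a tie goes to the proposal closer to the base, and the labels `E1`, `E2`, `F2` and `BASE` and all vote counts are invented.

```python
from typing import List, Sequence, Tuple

Ranking = Sequence[str]  # one delegate's preferences, most preferred first


def pairwise_winner(a: str, b: str, rankings: Sequence[Ranking]) -> str:
    """Return a if a strict majority ranks it above b, otherwise b.

    Assumption: on a tie, b (the proposal closer to the base) wins.
    """
    a_votes = sum(1 for r in rankings if r.index(a) < r.index(b))
    return a if 2 * a_votes > len(rankings) else b


def sequential_vote(
    proposals: Sequence[str], rankings: Sequence[Ranking]
) -> Tuple[str, List[Tuple[str, str, str]]]:
    """Run the votes in order; proposals are sorted from the one that
    differs most from the base to the base text itself (assumed last)."""
    if not proposals:
        raise ValueError("need at least one proposal")
    rounds = []
    current = proposals[0]
    for challenger in proposals[1:]:
        winner = pairwise_winner(current, challenger, rankings)
        rounds.append((current, challenger, winner))
        current = winner
    return current, rounds


# Invented example: two equality wordings, one alternative feminism
# wording, and the base text (which mentions feminism).
proposals = ["E2", "E1", "F2", "BASE"]
rankings = [["BASE", "F2", "E1", "E2"]] * 3 + [["E1", "E2", "F2", "BASE"]] * 2
winner, rounds = sequential_vote(proposals, rankings)
```

In this made-up run the vote that settles equality versus feminism is the second of three, `E1` against `F2`, and nothing in the procedure itself marks it as special.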

Here the chairs of the meeting made several poor decisions. The feminism versus equality vote of the opening anecdote was the most serious of them, but there were others. A great many different wording proposals were raised for a vote on the matter; in some the party was said to stand for equality, and in others feminism was mentioned. The question of mentioning feminism was one of the key questions in the congress debate, and it has been discussed at earlier party congresses too. The chairs arranged the proposals in a voting order following the detailed parliamentary order: since feminism was in the base paper, all the proposals dealing with equality differed most from the base, and a vote between the feminism wordings was reached only after all the equality proposals had lost. As a result, the decisive vote of feminism against equality was held in the middle of a long series of votes. Since the chair announced each vote only by stating whose proposal it concerned, many delegates surely did not realise that this was the decisive vote. A better solution would have been a grouped vote, in which one first chooses between feminism and equality and only then starts voting on the exact wording of the winning position. It would also have sufficed if the chair had pointed out the decisive nature of the vote.
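A grouped vote of that kind could look roughly like the following Python sketch. It is my own illustration, not a description of any actual meeting rules: I assume that each delegate is a fixed preference ranking, that in the first stage a delegate backs the group containing their top-ranked wording, that a tie between groups goes to the group listed first, and that a tie between wordings goes to the later-listed one. All labels and vote counts are invented.

```python
from typing import Dict, List, Sequence, Tuple

Ranking = Sequence[str]  # one delegate's preferences, most preferred first


def pairwise_winner(a: str, b: str, rankings: Sequence[Ranking]) -> str:
    """Return a if a strict majority ranks it above b, otherwise b."""
    a_votes = sum(1 for r in rankings if r.index(a) < r.index(b))
    return a if 2 * a_votes > len(rankings) else b


def grouped_vote(
    groups: Dict[str, List[str]], rankings: Sequence[Ranking]
) -> Tuple[str, str]:
    """Stage 1: one clearly labelled vote between the groups.
    Stage 2: pairwise votes among the winning group's wordings only."""
    group_of = {w: g for g, wordings in groups.items() for w in wordings}
    tally = {g: 0 for g in groups}
    for r in rankings:
        tally[group_of[r[0]]] += 1  # assumed: vote follows the top choice
    winning_group = max(tally, key=tally.get)  # tie: first-listed group

    wordings = groups[winning_group]
    current = wordings[0]
    for challenger in wordings[1:]:
        current = pairwise_winner(current, challenger, rankings)
    return winning_group, current


# Invented example: the base text mentions feminism.
groups = {"feminism": ["F2", "BASE"], "equality": ["E2", "E1"]}
rankings = [["BASE", "F2", "E1", "E2"]] * 3 + [["E1", "E2", "F2", "BASE"]] * 2
result = grouped_vote(groups, rankings)
```

The point of the design is that the question of principle is put to the hall once, under its own name, before anyone has to think about wordings.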

This is, however, hindsight! I do not know whether I myself would have done any better at the front of the hall. What I do know is that I, at least, did not grasp this problem in time to raise it in a point of order. The root cause of the problem is that the congress has far too little time to handle the amendment proposals raised during it. The rush affects both the chair’s decisions and the delegates’ situational awareness. The rush also means that no real discussion is allowed on the proposals that come up at the congress.

I know of no other solution to this than reducing the number of proposals (which Virpi Kauko also mentions). That is best achieved by shortening the base paper. I hope the next programme will be at most ten pages long (this time the base proposal ran to 26 pages!).

It is also worth considering whether requests for votes could be made earlier, so that the chair would not have to draw up the voting order under heavy pressure at the front of the hall, and the delegates would not have to try to understand its significance in a few dozen seconds. I do not, however, see how this could sensibly be done in practice.