The Code Problem

The issue on my mind is whether digital humanists ought to code. This topic comes up all the time and represents what appears to be an incredibly vexing issue among rising digital humanists who want to join the club, especially women, who can see right away that most coders are male and behave like they are members of some kind of boys club, in spite of their chummy invitations to join. Most recently Miriam Posner blogged about her problems with the Thou Shalt Code commandment that seems to pervade DH today. I won’t reproduce her arguments, with which I am sympathetic, except to say that I think the issue goes beyond gender and race – white males feel this anxiety too; they just don’t have the added burden of a culture that tells them this particular shortcoming is the fault of their essential nature. So they can blow it off and not feel too bad about it. Which is of course consistent with Miriam’s point (and with that of the XKCD cartoon she embeds). But I want to get at another issue beyond the background radiation of implicit bias. I want to get at the rationale for coding in DH in the first place, which is I think the real cause of most of this anxiety. Because implied by the exhortation to code are a bunch of unspecified assumptions about code, its nature, why it is important, and why you should learn it. Some of these assumptions are just wrong-headed and, as we are seeing, divisive.

Let me start by giving my reasons for why a person should learn to code if they want to fully participate in this DH thing that is happening right now and perhaps get hired in the field. The first is the reason Tim Berners-Lee gives when he exhorts journalists to learn – you need to know how to use tools to manipulate data because knowledge is increasingly produced as data, that is, in more or less structured forms all over the web. This is because the future of the humanities “lies with [humanists] who know their CSV from their RDF, can throw together some quick MySQL queries for a PHP or Python output … and discover the story lurking in datasets released by governments, local authorities, agencies, digital archives, online libraries, academic centers, or any combination of them – even across national borders.” That is, you should know how to write scripts in a language that lets you pull and munge and mess around with data, including texts. In this view, you don’t have to write full-blown applications, and you don’t have to know a particular language – a huge source of anxiety, of which I say more below.

The second reason to learn to code is philosophical. You should be able to write code – not necessarily program or, God forbid, “develop” – so that you can understand how machines think. Play with simple algorithms, parse texts and create word lists, generate silly patterns a la 10 PRINT. Get a feel for what these so-called computer languages do. Get a feel for the proposition, to which I mostly assent, that text is a kind of code and code a kind of text (but with really important differences that you won’t discover or understand until you play around with code). This level of knowledge does not require any great mastery of a language in my view. It only requires a willingness to get one’s hands dirty, make mistakes, and accept the limitations of beginner’s knowledge. I personally believe that this second reason is as important as the first, or more so. It will connect you with the heart of DH – the encounter with the machine – that, in my reading, Willard McCarty recently described in his Busa Award acceptance speech. (I just wrote something about that here.)
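For readers who want to see how small this kind of play can be, here is a minimal sketch of the two exercises just mentioned: building a word list from a text and printing a 10 PRINT-style maze. It is written in Python purely for illustration. The function names `word_list` and `ten_print` are invented for this example, the rule for what counts as a word is a deliberately crude assumption, and plain slash characters stand in for the diagonal glyphs of the original one-liner.

```python
import random
import re
from collections import Counter


def word_list(text):
    """Return (word, count) pairs, most frequent first.

    Assumption: a "word" is any run of lowercase letters or apostrophes
    after lowercasing. Real texts will need a better rule.
    """
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common()


def ten_print(rows, cols, seed=None):
    """Build a maze-like pattern by choosing / or \\ at random for each cell."""
    rng = random.Random(seed)  # pass a seed to get a repeatable pattern
    lines = []
    for _ in range(rows):
        lines.append("".join(rng.choice("/\\") for _ in range(cols)))
    return "\n".join(lines)


sample = "Text is a kind of code and code a kind of text"
for word, count in word_list(sample):
    print(count, word)

print(ten_print(4, 40))
```

Changing the sample sentence, the word rule, or the two characters in `ten_print` and rerunning is the whole exercise: the point is to make a small utterance in code and see what the machine does with it.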

I don’t mean these are the only reasons to learn code. I mean these are two very basic reasons that a digital humanist wanting to code should consider. There are of course lots of other reasons for people to go deeper down the rabbit hole. In my own case, I am turned on by data models and the relationship they have to the more general human practice of cultural modeling that defines our species. But that’s me.

Now, here’s the thing, the problem that I have with the culture of code in DH today: To get to this place with code, to be able to write simple scripts that are useful or interesting or both, you don’t need to do many of the things your coding brethren think you should do. First and foremost, you don’t need to learn a specific language unless there is a compelling local reason to do so, such as being in a class or on a project that uses the language. This is a really important point. Many will tell you that you HAVE to learn Python or Ruby or R. But this kind of talk is off-putting and counter-productive for those wanting to get into coding. Many of these languages are actually pretty hard. Ruby, for example, forces beginners to understand things like symbols versus strings as they are learning what variables and data types are. And R expects you to understand statistics. The language you choose should really depend on what you want to do, and if you want to do what I am describing above, then many languages will do. All other things being equal, I suggest choosing a simple language, one without strong opinions (like “Everything is an Object!”) and without the need to learn ten things before you can print “Hello, World!” Also choose one with good online documentation and a large user community. PHP, for example.

Second, you don’t need to be involved in writing a full-blown application to do DH-worthy coding. Applications are fine, and being on a collaborative project has huge benefits of its own, but know that application development is a huge time-suck and that applications are like restaurants – fun to set up but most likely to fail in the real world. Lots of DH coding projects in my experience are journeys, not destinations. People get involved with them to learn how to code and to collaborate and to have something cool to show. But real, useful applications – like Zotero or Neatline – are developed by full-time coders who know their business. And these developers are not writing dissertations in history or literature. So don’t use these kinds of projects as a measure for a general DH coding competence level. One other thing to mention about applications, as long as I’m ranting – applications can be DH works in their own right, but the best ones are just useful to those conducting research in the humanities per se, that is, on culture and history and literature. In the sciences, people write applications and give them away so that others can use them too. They don’t make careers out of them. Digital humanists should do the same.

Third, there is no reason ever to be forced into using a specific editor or coding environment, especially if it is a difficult one that “real” coders use. To be more specific: don’t let anyone tell you that you have to use vim or emacs. These are great editors with good pedigrees, but forcing them on new coders is akin to hazing. To the new coder, the editor is just another thing to learn. New coders should use something simple that feels comfortable and does not require a manual. Having said this, default editors like Notepad and TextEdit are really bad and should not be used. I give a suggestion below.

Beyond these specific problems, though, there is a more fundamental issue about the culture of code that contributes to the condition that Miriam and others confront: in spite of the well-meaning desire by many coders to bring everyone into the coding fold, there is a countervailing force that prevents this from happening and that emanates from these same coders. This is the force of mystification. Mystification appears in many forms, including some of the things I just described – insisting on a difficult editor, dissing certain languages – but it more generally comes from treating code competence as a source of identity, whether it be personal or disciplinary. As long as digital humanists regard coding as a marker of prestige – and software as a token in the academic economy – and not as a means to other forms of prestige (such as making discoveries or writing books), then knowledge of coding will always be hedged in by taboos and rites of passage that will have the effect of pushing away newcomers.

I see the effects of the software-as-prestige-good model all the time, and it doesn’t just happen among DH coders. It happens when programmers use the epithet “script kiddie” to dismiss those who don’t write applications, or who are not formally trained in computer science or software engineering. It happens when HTML is not considered code because it lacks conditional logic, when in fact writing HTML is an excellent entree into coding for lots of reasons. And it happens any time “Introduction to Computing” is taught with Java as the language. For to begin with a language that forces data typing, declaration of variables, class definitions and instantiations, and all sorts of other encumbrances on a new user is not to teach, it is to filter.

I speak from some experience here. I teach code to about forty students, mostly women, every year, in a course sequence on the Digital Liberal Arts. My approach is very simple – do as little as possible to get between the student and the act of coding. I teach them PHP, because it is dirt simple and asks very little of them at first, and I have them use a basic cross-platform text editor (jEdit) that does not require them to learn keyboard patterns to get started. I tell them that coders are more like artists than engineers, failing early and often, messing around, throwing things away, and often surprised by their own results. I tell them that coding is like writing except when it’s not. And I teach them to do something useful with their knowledge of code, like grab a CSV file and convert it into a network structure that can be parsed by a network tool like ManyEyes, Gephi, or SHIVA, and dumped into an essay about something. I get good reviews and many of my students go on to post-graduate programs in related fields. What I am most surprised by, though, are not the Media Studies students who, without prior experience, end the course with the confidence to code, but the computer science students who tell me they feel liberated by the course, who feel like they’ve discovered the beauty of code for the first time. In demystifying code, I liberate students to engage with it more authentically. (Ironically, I believe that the demystification of code defamiliarizes it for CS students, who then see things differently.)
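To make the CSV-to-network exercise concrete, here is a minimal sketch of one way it could go. It is written in Python purely for illustration and is not the course's own PHP material. The data, the column names (`sender`, `recipient`, `year`), and the function names are all invented for this example. The output is a weighted edge list with Source, Target, and Weight columns, a layout that network tools such as Gephi can generally import; check your own tool's documentation for the exact format it expects.

```python
import csv
import io
from collections import Counter

# Hypothetical input: one row per letter. The names and columns are made up.
LETTERS_CSV = """sender,recipient,year
Ada,Charles,1843
Charles,Ada,1843
Ada,Charles,1844
Ada,Mary,1844
"""


def csv_to_edges(source, from_col, to_col):
    """Count how many rows connect each (from, to) pair of names."""
    edges = Counter()
    for row in csv.DictReader(source):
        edges[(row[from_col].strip(), row[to_col].strip())] += 1
    return edges


def write_edge_list(edges, out):
    """Write the edges as CSV rows under a Source,Target,Weight header."""
    writer = csv.writer(out)
    writer.writerow(["Source", "Target", "Weight"])
    for (src, dst), weight in sorted(edges.items()):
        writer.writerow([src, dst, weight])


edges = csv_to_edges(io.StringIO(LETTERS_CSV), "sender", "recipient")
result = io.StringIO()
write_edge_list(edges, result)
print(result.getvalue())
```

To work from a real file, swap `io.StringIO(LETTERS_CSV)` for `open("letters.csv", newline="")`, where `letters.csv` is a placeholder for whatever data you actually have. The script treats each pair as directed, so a letter from Ada to Charles is a different edge from one going the other way; whether that is the right model depends on the question being asked.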

So what I’m saying is this: DH coders should be doing everything possible to demystify coding for newcomers. I think doing this would dispel a ton of bad will and shift attention away from bias and toward the difficulties of talking to machines in the first place. And I don’t mean dumbing it down either. I mean beginning with the intelligible and proceeding to the less intelligible and indeed more mysterious. The thing to remember is that the code is the thing – not the language, not the development environment or the operating system or the application. If you are a new coder, anything that gets between you and the ability to make utterances in code is a problem.

2 thoughts on “The Code Problem”

  2. Lisa Spangenberg

    While I would absolutely encourage everyone in every possible field to learn to code, I don’t think coding in the sense of a compiled language, or even scripting in the sense of, say, AppleScript, JavaScript, or shell scripts, is necessary.

    I think a sufficient facility with any language, meaning the ability to parse it at the levels of syntax and grammar as well as context, is a fair substitute. I would also accept a sufficient level of utility and understanding of HTML/XML and CSS, or even a sophisticated understanding of Boolean logic, as quite adequate.

    I am a humanist. I also have been working in the software industry since 1989. You don’t have to write code to understand it, any more than you have to be able to write Medieval Latin or Modern German in order to understand it in written or even spoken forms.

    What we want in humanists are the patterns of thought and kinds of analytic skills used in coding, thriving right along with more textual and traditional Humanistic skills. The skills involved in parsing a human language and flipping back and forth from the micro level of morphology and grammar to the macro level of context and textual surface are strikingly similar to those involved in the code/UI or structure/content distinction, or even the Lanham variant of At/Through.

    As someone who has hired and worked with software engineers and programmers, I have found that, in very broad terms, those engineers and programmers who were particularly skilled and joyful in their approach to code (and humanists) tended to have additional extraordinary levels of expertise in either human languages or music.
