A Computer Scientist’s Guide To Henry James’s Never-Ending Sentences

Anthony Barr
Jonathan Reeve used mathematical models to create a unique literary analysis of Henry James’s writing.

The great 19th-century American novelist Henry James is known for his sentences, specifically for the length of his sentences, which are often the size of paragraphs. While some grammar teachers bristle at the thought of a “run-on” sentence, James’s writing shows us that, as long as all the grammatical parts are working together properly, sentences can effectively spill into several lines without losing the reader’s attention or comprehension.

Jonathan Reeve is a PhD candidate studying English and Comparative Literature at Columbia University. Reeve approaches literary studies through the discipline of the digital humanities. His particular field of computational literary analysis uses mathematical models to produce data that helps us think about literature in unique ways. His project, The Henry James Sentence: New Quantitative Approaches, reveals some fascinating details about James’s literary work.

Reeve asks us to read the following sentence from James’s novel The Portrait of a Lady:

The house had a name and a history; the old gentleman taking his tea would have been delighted to tell you these things: how it had been built under Edward the Sixth, had offered a night’s hospitality to the great Elizabeth (whose august person had extended itself upon a huge, magnificent and terribly angular bed which still formed the principal honour of the sleeping apartments), had been a good deal bruised and defaced in Cromwell’s wars, and then, under the Restoration, repaired and much enlarged; and how, finally, after having been remodelled and disfigured in the eighteenth century, it had passed into the careful keeping of a shrewd American banker, who had bought it originally because (owing to circumstances too complicated to set forth) it was offered at a great bargain: bought it with much grumbling at its ugliness, its antiquity, its incommodity, and who now, at the end of twenty years, had become conscious of a real aesthetic passion for it, so that he knew all its points and would tell you just where to stand to see them in combination and just the hour when the shadows of its various protuberances—which fell so softly upon the warm, weary brickwork—were of the right measure.

This is the single longest sentence in James’s novels. And Reeve observes that “like the house it describes, it is copious, labyrinthine: an architectural wonder.” Reeve further notes that “the sentence begins by traversing several centuries in just a few words, continues by covering a twenty-year personal history, and finishes with a cadenza that elongates time just enough to enable us to savor the certain slant of light illuminating the dear old house.” Reeve’s computational approach allows him to quantify certain features of the sentence. In particular, Reeve uses dependency parsing, “a method of computational linguistics and natural language processing that algorithmically infers syntactic dependencies between words in a sentence. Adjectives that describe a noun, for instance, are graphed as the noun’s dependents.” If that doesn’t sound like English to you, don’t worry. Just look at the incredible diagram this approach allows us to create (make sure you zoom in!).

Reeve points out that this diagram lets us see “both the relative balance of the sentence—its tidy list-like history—as well as its ultimate imbalance—its descent into the realm of the slow, minute, and sensory world of shadows and warm bricks.”
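For readers curious about what sits underneath such a diagram, here is a minimal sketch in Python of how a dependency parse can be stored and queried. It is not Reeve's code, and it does not run a real parser: the head assignments for the opening clause "The house had a name" are hand-annotated assumptions made for illustration, and the names TOKENS, HEADS, dependents, and depth are hypothetical. Each word simply points to the word it depends on, so the articles attach to their nouns the same way an adjective would.

```python
from collections import defaultdict

# Hand-annotated heads for the opening clause "The house had a name".
# These are illustrative assumptions, not the output of any parser.
# Each token stores the index of its head; the root stores -1.
TOKENS = ["The", "house", "had", "a", "name"]
HEADS = [1, 2, -1, 4, 2]


def dependents(heads):
    """Map each head index to the indices of the tokens that depend on it."""
    children = defaultdict(list)
    for token_index, head_index in enumerate(heads):
        if head_index >= 0:
            children[head_index].append(token_index)
    return dict(children)


def depth(heads, token_index):
    """Count the arcs between a token and the root of the sentence."""
    steps = 0
    while heads[token_index] >= 0:
        token_index = heads[token_index]
        steps += 1
    return steps


if __name__ == "__main__":
    for head_index, child_indices in sorted(dependents(HEADS).items()):
        child_words = [TOKENS[i] for i in child_indices]
        print(f"{TOKENS[head_index]} <- {child_words}")
    print("max depth:", max(depth(HEADS, i) for i in range(len(TOKENS))))
```

A measure like the maximum depth gives one rough way to put a number on how deeply nested a sentence is, which is the kind of feature a diagram of a long Jamesian sentence makes visible.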

As another example, here’s a sentence from James’s novel The Bostonians:

This edifice, a diminished copy of the chapel of King’s College, at the greater Cambridge, is a rich and impressive institution; and as he stood there, in the bright, heated stillness, which seemed suffused with the odour of old print and old bindings, and looked up into the high, light vaults that hung over quiet book-laden galleries, alcoves and tables, and glazed cases where rarer treasures gleamed more vaguely, over busts of benefactors and portraits of worthies, bowed heads of working students and the gentle creak of passing messengers–as he took possession, in a comprehensive glance, of the wealth and wisdom of the place, he felt more than ever the soreness of an opportunity missed; but he abstained from expressing it (it was too deep for that), and in a moment Verena had introduced him to a young lady, a friend of hers, who, as she explained, was working on the catalogue, and whom she had asked for on entering the library, at a desk where another young lady was occupied.

Reeve describes this sentence as a “journey through space, an admiring pan through the library that nonetheless causes him deep ‘soreness.’” Reeve asks us to observe how the sentence moves from description to action as seen in the clause, “in a moment Verena had introduced him to a young lady.” Reeve writes, “the fluidity of this transition is underlined by the immediacy signaled by ‘in a moment,’ which indicates that a sharp temporal shift has taken place. Time, that had been allowed to flow aimlessly and viscously across the objects of the library, now, ‘in a moment,’ snaps back into place, and we again reach the staccato rhythm of action: ‘a friend of hers / who / as she explained.’”

Here is the diagram of the sentence, based on the dependency parsing.

Using the computational approach, Reeve begins to ask questions like, “did James use longer sentences more in his earlier novels or his later novels?” and, “are there statistically higher uses of rare words in James’s longer sentences than his shorter ones?” Reeve plots out the data in scatter graphs and bar charts, allowing him to see patterns emerge. While this might seem like an odd and perhaps even unproductive approach to literary analysis, it allows Reeve to note thematically significant features of James’s writing. For example, he notes:

The most distinctive lemmas of James’s long sentences include many that seem appropriate to aesthetically sensible, objective descriptions. There are architectural words, such as place and room; the lemmas [word groupings] distinctive of very long sentences add window and light. These lists are full of adjectives describing size or magnitude: great, small, high, low, and little. Markers of time like hour, second, evening, and occasion are also here. The sensory words sense and feel appear in these lists, alongside the more legal lemmas particular, effect, and fact. There are also indicators of a curious, interrogative mood in question and interest.

On the other hand, James’s short sentences are devoted to action in the external world, defined by action verbs like “say, speak, tell, ask, look, and come. There are also the cognitive states of think and know, the anticipatory want and shall, and finally the almighty verb be.”
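The comparison between long and short sentences can be sketched in a few lines of Python. This is an assumption-laden illustration, not Reeve's method: it splits sentences naively on punctuation, uses lowercased words as a stand-in for true lemmas, and scores each word by the difference between its share of long-sentence words and its share of short-sentence words. The function names, the threshold, and the sample text (invented here, not taken from James) are all hypothetical.

```python
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def words(sentence):
    """Lowercased word tokens; a rough stand-in for real lemmatization."""
    return re.findall(r"[a-z']+", sentence.lower())


def distinctiveness(text, threshold):
    """Score each word: share among long-sentence words minus share among short.

    A sentence counts as long when it has more than `threshold` words.
    Positive scores lean toward long sentences, negative toward short ones.
    """
    long_counts, short_counts = Counter(), Counter()
    for sentence in split_sentences(text):
        tokens = words(sentence)
        bucket = long_counts if len(tokens) > threshold else short_counts
        bucket.update(tokens)
    long_total = sum(long_counts.values()) or 1
    short_total = sum(short_counts.values()) or 1
    vocabulary = set(long_counts) | set(short_counts)
    return {
        word: long_counts[word] / long_total - short_counts[word] / short_total
        for word in vocabulary
    }


if __name__ == "__main__":
    # Invented sample text, used only to exercise the functions.
    sample = (
        "She said no. He knew. The great room had a high window "
        "and the light of the evening was small."
    )
    scores = distinctiveness(sample, threshold=5)
    for word, score in sorted(scores.items(), key=lambda item: item[1]):
        print(f"{word:10s} {score:+.3f}")
```

On a real corpus one would want a proper sentence tokenizer, a lemmatizer, and a statistical test rather than a raw difference, but the shape of the question stays the same: which words show up disproportionately once a sentence grows long?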

Ultimately, this approach to literature and grammar can never replace the experience of closely reading a text and being moved by the power of its prose. But the digital humanities as a discipline can help us to see texts in a new light. Reeve’s project is well worth exploring, for both its beauty and its insights.