Male/Female Instance Distance Analysis

The code used for this analysis can be found in

The raw analysis returns a dictionary with each novel mapped to an array of 3 lists:

  • Median, mean, minimum, and maximum distances between male pronoun instances.
  • Median, mean, minimum, and maximum distances between female pronoun instances.
  • For each of the above statistics, the difference between the male and female values (male statistic minus female statistic). A positive difference means the distance between male pronoun instances is larger than the distance between female pronoun instances, so female pronouns occur more frequently.

For example:

A median female instance distance of 14.0 versus a median male instance distance of 10.0 means that female instances tend to occur every 14 words in the novel, while male instances tend to occur every 10 words. This would indicate a greater male presence in the novel.

A mean difference value of 4.0 means that, on average, the distance between male instances is 4 words longer than the distance between female instances.
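
The structure described above can be sketched in a few lines of Python. This is only an illustration, not the project's actual code: the pronoun sets, the function names, and the choice to return zeros when a novel has fewer than two instances are all assumptions made for the example.

```python
from statistics import mean, median

# Hypothetical pronoun sets; the source does not list the exact words it counts.
MALE_PRONOUNS = {"he", "him", "his", "himself"}
FEMALE_PRONOUNS = {"she", "her", "hers", "herself"}


def instance_distances(tokens, pronouns):
    """Return the word distances between consecutive pronoun instances."""
    positions = [i for i, tok in enumerate(tokens) if tok.lower() in pronouns]
    return [b - a for a, b in zip(positions, positions[1:])]


def distance_stats(distances):
    """Return [median, mean, min, max]; zeros if there are no gaps (an assumption)."""
    if not distances:
        return [0, 0, 0, 0]
    return [median(distances), mean(distances), min(distances), max(distances)]


def analyze(tokens):
    """Return [male_stats, female_stats, difference] for one tokenized novel."""
    male = distance_stats(instance_distances(tokens, MALE_PRONOUNS))
    female = distance_stats(instance_distances(tokens, FEMALE_PRONOUNS))
    # Male minus female: a positive value means male instances are farther apart.
    difference = [m - f for m, f in zip(male, female)]
    return [male, female, difference]
```

Calling `analyze` on each novel's token list and storing the result under the novel's title would give a dictionary of the shape described above.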

By Author Gender:

Average Median Male Instance Distance:

Male Authors: 12.61

Female Authors: 13.79

Average Median Female Instance Distance:

Male Authors: 72.30

Female Authors: 30.17

p-value = 0.0184
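
Each figure above is an average of per-novel medians within a group. A minimal sketch of that aggregation follows, assuming the per-novel results use the dictionary structure described earlier and that a separate, hypothetical `groups` mapping gives each novel's author gender. The function name and parameters are illustrative, and the sketch does not reproduce the significance test behind the p-value, which the source does not name.

```python
from collections import defaultdict
from statistics import mean


def average_median_by_group(results, groups, which=1):
    """Average the per-novel median distance within each group.

    results: novel -> [male_stats, female_stats, difference_stats],
             where each stats list is [median, mean, min, max].
    groups:  novel -> group label (hypothetical mapping, e.g. author gender).
    which:   0 for male stats, 1 for female stats, 2 for the difference.
    """
    buckets = defaultdict(list)
    for novel, stats in results.items():
        buckets[groups[novel]].append(stats[which][0])  # index 0 is the median
    return {label: mean(values) for label, values in buckets.items()}
```

The same function could be reused for the decade and location breakdowns by swapping in a different `groups` mapping.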

By Decade:

Average median female instance distance in each decade

p-value = 0.942

By Location:

Average median female instance distance in each location

Published in England: 63.04

Published in US: 20.59

Published in other country: 62.02

p-value = 0.317

Novels With Greatest Instance Distances

Top 10 novels with greatest median female instance distances:

  1. Don Hale with the Flying Squadron by William Crispin Sheppard
    Median female instance distance: 19713.5

  2. The Head of Kay's by Pelham Grenville Wodehouse
    Median female instance distance: 15962

  3. Teddy and Carrots: Two Merchants of Newspaper Row by James Otis
    Median female instance distance: 9906

  4. The Outdoor Chums in the Big Woods by Quincy Allen
    Median female instance distance: 9804

  5. The Boy Scouts at the Canadian Border by John Henry Goldfrap
    Median female instance distance: 9507

  6. Our Young Aeroplane Scouts in Germany; or, Winning the Iron Cross by Horace Porter
    Median female instance distance: 8671

  7. The Keepers of the Trail: A Story of the Great Woods by Joseph Alexander Altsheler
    Median female instance distance: 7551.5

  8. Storm-Bound; or, A Vacation Among the Snow Drifts by Captain Alan Douglas
    Median female instance distance: 6335.5

  9. Tom Slade's Double Dare by Percy Keese Fitzhugh
    Median female instance distance: 5654

  10. The Wonder Island Boys: Conquest of the Savages by Roger Thompson Finlay
    Median female instance distance: 5091

Top 10 novels with greatest median male instance distances:

  1. Marjorie's Busy Days by Carolyn Wells
    Median male instance distance: 43.5

  2. The Mary Frances Cook Book by Jane Eayre Fryer
    Median male instance distance: 39

  3. The Motor Girls on the Coast; or, The Waif From the Sea by Margaret Penrose
    Median male instance distance: 36

  4. The Adopting of Rosa Marie by Carroll Watson Rankin
    Median male instance distance: 34

  5. Ruth Fielding At College; or, The Missing Examination Papers by Alice B. Emerson
    Median male instance distance: 33.5

  6. Mary Jane in New England by Clara Ingram Judson
    Median male instance distance: 32

  7. Wanted: A Cook by Alan Dale
    Median male instance distance: 32

  8. The Armed Ship America by James Otis
    Median male instance distance: 31

  9. The Mary Frances Garden Book; or, Adventures Among the Garden People by Jane Eayre Fryer
    Median male instance distance: 31

  10. The Mary Frances Knitting and Crocheting Book by Jane Eayre Fryer
    Median male instance distance: 30


Overall, both male and female authors have greater median female instance distances than male instance distances. However, the median female instance distance differs significantly between novels written by female authors and those written by male authors (p = 0.0184): on average, female pronouns are spaced much farther apart in novels by male authors (72.30) than in novels by female authors (30.17). There is no statistically significant trend by decade (p = 0.942) or by location of publication (p = 0.317).

Looking at the top 10 novels in each category, it is important to note that the highest male instance distances are far smaller than the highest female instance distances. The top 10 median male instance distances range from 30 to 43.5, while the top 10 median female instance distances range from 5091 to 19713.5. This may indicate that even female-centric novels still have a male presence, while male-centric novels exclude female characters to a greater degree.