Excluding Comics

Given that the central drive of our research project is to define what was “typical” of American comic books over the past eight decades, we need to begin with a working definition of an “American comic book”. We have already signalled that problem with a discussion of the Spirit supplements that began in the 1940s, but there are plenty of other occasions where this issue raises its definitionally ugly head.

In order to define typicality, it is our goal to randomly sample two per cent of the comic books published in the United States. We are not interested in “the best” works but wish to work from a completely neutral sample set. This requires several steps:

1. Define what we mean by a “comic book”
2. Determine how many “comic books” were published in each year
3. Randomly select two per cent of those to form a data set
4. Track down electronic or physical copies of the comics that are in the data set
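
Steps two and three can be sketched in a few lines of Python. This is only an illustration, not the project's actual tooling: the function names, the placeholder issue identifiers, and the fixed seed are all assumptions, and rounding up in every year generalises the "round up to one" rule we apply to very small years.

```python
import random


def sample_size(count, percent=2):
    """Number of issues to draw: `percent` of `count`, rounded up.

    Integer arithmetic avoids floating-point surprises. Rounding up
    in every year is an assumption made for this sketch.
    """
    if count == 0:
        return 0
    return (count * percent + 99) // 100


def sample_year(issue_ids, percent=2, seed=None):
    """Draw a simple random sample, without replacement, from one year."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    ids = list(issue_ids)
    return rng.sample(ids, sample_size(len(ids), percent))


# Placeholder identifiers standing in for the 1,635 search results for 1970.
issues_1970 = [f"1970-{n:04d}" for n in range(1, 1636)]
picked = sample_year(issues_1970, seed=1970)
print(len(picked))  # 33 issues, i.e. two per cent of 1,635 rounded up
```

Recording the seed alongside the sample would let anyone redraw exactly the same set of issues later.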

One interesting thing that I’ve already discovered is that it is somewhat easier to do step two than it is to do step one.

Right now, we are extremely fortunate that a tool like the Grand Comics Database exists. This volunteer project has been cataloguing comics in the United States and around the world for a long time, and it is a remarkably flexible and searchable database. Over the past few days I have been running some experiments on the GCD data with their Advanced Search functionality, where I can set country of publication to the United States and delimit an annual date range very easily. Let’s consider an example:

Setting the date range to 1970 returns 1,635 items. Even right there on the first page of results, however, we run into a couple of issues. Scrolling down we see Archie and the Generation Gap and Archie’s Girls, both published by Bantam Books. Since I own both of these as a result of writing Twelve-Cent Archie, I can tell you that these are not “comic books” in the sense that we are using that term - they are books that we would like to exclude.

So how to best do that? The simplest way would be to refine our search based on format, by narrowing it to comics that are “saddle stitched” (to use a GCD term). This reduces our count from 1,635 to 1,432 and it eliminates the Bantam Books. Success! But, what if we have eliminated too much with this search? After all, many giant size comics (like Annuals) were square bound. Indeed, a search with “square bound” as the format turns up 411 comics (certain titles, like Action Comics, are listed with both formats). That’s not ideal. Similarly, searching with “cardstock” as the paper stock catches the Bantam Books and others (44 in total). At first glance, it looks like all of those should probably be excluded.
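
One way to check whether a format filter has eliminated too much is to list what it drops and review only the entries that no other rule already excludes. The sketch below assumes the search results have been exported as simple records. The field names (`bindings`, `paper`), the "newsprint" value, and the placeholder entries are invented for illustration and are not the GCD's actual schema.

```python
def dropped_by_binding(records, binding):
    """Records that a search restricted to `binding` would leave out."""
    return [r for r in records if binding not in r["bindings"]]


def needs_review(records, binding="saddle stitched", excluded_paper="cardstock"):
    """Dropped records that are NOT already excluded by paper stock.

    These are the entries (square-bound annuals, for instance) that a
    saddle-stitched-only search might be discarding by mistake.
    """
    return [
        r for r in dropped_by_binding(records, binding)
        if r["paper"] != excluded_paper
    ]


# Illustrative records only; the values are made up for this sketch.
records = [
    {"title": "Action Comics (listed with both formats)",
     "bindings": {"saddle stitched", "square bound"}, "paper": "newsprint"},
    {"title": "Placeholder Bantam paperback",
     "bindings": {"square bound"}, "paper": "cardstock"},
    {"title": "Placeholder giant-size annual",
     "bindings": {"square bound"}, "paper": "newsprint"},
]

for record in needs_review(records):
    print(record["title"])  # only the placeholder annual is flagged
```

The idea is that the review list stays much shorter than the full result set, so the hand check is limited to the borderline cases.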

Beginning with “saddle stitched” isn’t a great deal of help because the GCD likely includes magazines that we might want to exclude. We could refine by searching on “standard silver age size” - this brings up 1,065 results - but that makes me wonder how we could have possibly excluded such an enormous proportion of our initial search (almost 600 issues). The GCD has a more expansive definition of comics than what we are using, which is great, but finding exactly the information that we need becomes a challenge.

There is one obvious solution: Review all the data by hand. Simply begin with the largest number and then check the entries and do the exclusions manually. I did that, for example, for 1935.

Running our basic search for 1935, we get 41 results. This seemed intuitively high to me based on my knowledge of comics history. Indeed, some results seemed to need to be scrubbed right away. The Seventh New Yorker Album is a hardcover collection of New Yorker gag cartoons and is not what we would conventionally call a comic book. It’s out. The strip collections published by David McKay (Popeye, Henry, Little Annie Rooney) similarly don’t fit our conventional definition - they’re out. The Tijuana Bibles that have been included here should be out. Finally, the first result is eight issues of Ballyhoo, a Dell-published humour magazine that was an important forerunner of Mad. It’s a fascinating magazine (read about it here), but it doesn’t fit our initial working definition.

In the end we are left with New Fun, New Comics, Mickey Mouse Magazine (a semi-borderline case because of the heavier use of non-comics pieces), Famous Funnies, and Big Book of Fun Comics (which has a cardboard cover, but which otherwise seems to fit). That is 24 “comic books” out of an initial sampling of 41.

In the end, the clean-up changes little for 1935. Two per cent of 41 is less than one book, and so is two per cent of 24 (we will round up to one for the year). Eliminating titles by hand wasn’t difficult, but the sample size is tiny. By contrast, for 2014 the GCD returns 8,793 results - that will take a lot longer to clean by hand. The very first result shows one of our problems: the first issue of the Image comic book ’68 Homefront is listed three times due to variant covers. That’s a really simple item to scrub out by putting “variant” into the search (an astonishing 1,799 results!). Still, that only brings the initial set down to roughly 7,000 entries - an awful lot to cull by hand.
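
The variant-cover clean-up and the arithmetic behind the 2014 figures can be sketched as follows. The record layout, the cover labels, and the placeholder title are assumptions made for illustration. The sketch also assumes that every “variant” hit duplicates an issue already counted, and it rounds two per cent up, as we do for 1935.

```python
def drop_variants(records):
    """Keep the first record seen for each (series, number) pair."""
    seen = set()
    kept = []
    for rec in records:
        key = (rec["series"], rec["number"])
        if key not in seen:
            seen.add(key)
            kept.append(rec)
    return kept


# Illustrative records: one issue listed three times, plus a placeholder.
hits = [
    {"series": "'68 Homefront", "number": 1, "variant": None},
    {"series": "'68 Homefront", "number": 1, "variant": "Cover B"},
    {"series": "'68 Homefront", "number": 1, "variant": "Cover C"},
    {"series": "Placeholder Title", "number": 7, "variant": None},
]
unique_issues = drop_variants(hits)
print(len(unique_issues))  # 2

# Counts from the 2014 searches described above.
remaining_2014 = 8793 - 1799                 # results left without variants
sample_2014 = -(-remaining_2014 * 2 // 100)  # two per cent, rounded up
print(remaining_2014, sample_2014)           # 6994 140
```

On these figures the 2014 sample would be about 140 issues, drawn from a pool that still needs its non-comic-book entries removed.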

You’ll note that even after all of this work we still won’t have defined what counts as an “American comic book”. Still a lot of work to do before we can even begin!