Origin in U.S. Civil War Survival Analysis
Problem
Investigators studied mortality within the Confederate North Carolina 30th Regiment during the American Civil War. They began by compiling a database of variables for each of 1500 men, from the date of enrolling in the regiment to the date of leaving, for whatever reason.
Solution
Used the Kaplan-Meier Estimator to estimate the survival function for the regiment, comparing death rates by infectious disease to those from combat-incurred injury.
Used logistic regression to analyze mortality data, stratified according to those independent variables that could reliably be distinguished within the historical record, such as age and whether someone was a "substitute" (paid in cash to serve in a rich man's stead).
John Cameron, Ph.D. and Robert Hirsh, M.D. are, respectively, Adjunct Faculty member, Dept. of History at Old Dominion University in Norfolk, VA and Assistant Professor of Anesthesiology at Cooper Medical School of Rowan University in Camden, NJ. Hirsh is a Group Leader for the Scholar's Workshop at Rowan University, in which physicians-to-be study such topics as epidemiology, biostatistics (including inferential statistics), information literacy, history of medicine, research methodology and ethics, and applied mathematics.
Cameron and Hirsh were interested in studying the mortality of a select group of soldiers from the American Civil War, in an attempt to understand how factors such as age of enlistment or conscription, military rank, and social status affected survival odds. In this study, Cameron and Hirsh compiled data for each of 1500 soldiers of the North Carolina 30th Regiment. The NC 30th was organized near Raleigh, NC, in October, 1861 and fought from June of 1862 until the Confederate surrender at Appomattox, Va in April of 1865, as part of the Army of Northern Virginia.
“What we were interested in here were odds of survival. The Kaplan-Meier analysis showed exactly what I expected: the men were far more likely to die of disease than of battle.”
Kaplan-Meier Survival Analysis
John Cameron: "It is well known that in all modern wars prior to 1900 more men died of disease than of combat. We wanted to determine what the actual situation was. We chose to investigate mortality within the North Carolina 30th. We began by building as accurate a dataset as possible to track what happened to each of those 1500 men from the date of enrolling in the regiment to the date of leaving for any reason. We now had data that was as accurate as possible on how long each man had served and what his fate was."
"So, with the help of OriginLab staff, we used Kaplan-Meier analysis on the mortality-with-time dataset to compare mortality due to combat-incurred injury vs. mortality due to infectious disease. The Kaplan-Meier analysis showed exactly what I expected: the men of the 30th were far more likely to die of disease than of battle (reflected in the steeper decline of the red plot in the figure to the right)."
Analysis showed 344 men dying of disease and 218 dying from combat-incurred injury. Log-rank, Breslow and Tarone-Ware tests of the null hypothesis all indicated that there was a significant difference in death rates from disease vs. injury.
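For readers who want to see the mechanics, here is a minimal pure-Python sketch of a Kaplan-Meier estimate and of the weighted two-sample test family that the log-rank, Breslow and Tarone-Ware tests belong to. It is not the Origin implementation and it does not use the regiment's data: the toy values, the function names `kaplan_meier` and `weighted_logrank`, and the way the two groups are defined are all assumptions made for illustration. The three tests differ only in the weight given to each death time (1, the number at risk, or its square root).

```python
import math
from collections import Counter


def kaplan_meier(durations, events):
    """Kaplan-Meier estimate as [(time, S(time))] at each distinct death time.

    durations: time under observation for each subject (e.g. months served)
    events:    1 if the subject died at that time, 0 if censored
               (left observation for any other reason)
    """
    deaths = Counter(t for t, e in zip(durations, events) if e)
    exits = Counter(durations)  # deaths and censorings both leave the risk set
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


def weighted_logrank(times_a, events_a, times_b, events_b, weight="logrank"):
    """Two-sample chi-square statistic (1 df) and its p-value.

    weight: "logrank" (w = 1), "breslow" (w = number at risk),
            or "tarone-ware" (w = square root of number at risk).
    """
    pooled = list(zip(times_a, events_a)) + list(zip(times_b, events_b))
    score = 0.0
    variance = 0.0
    for t in sorted({time for time, event in pooled if event}):
        n_a = sum(1 for x in times_a if x >= t)
        n_b = sum(1 for x in times_b if x >= t)
        d_a = sum(1 for x, e in zip(times_a, events_a) if e and x == t)
        d_b = sum(1 for x, e in zip(times_b, events_b) if e and x == t)
        n, d = n_a + n_b, d_a + d_b
        if n < 2:
            continue  # variance term is undefined with one subject left
        w = {"logrank": 1.0, "breslow": float(n), "tarone-ware": math.sqrt(n)}[weight]
        score += w * (d_a - d * n_a / n)  # observed minus expected deaths in group A
        variance += w * w * d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        raise ValueError("no informative event times")
    chi2 = score * score / variance
    p_value = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square, 1 df
    return chi2, p_value


# Hypothetical toy data (months observed; 1 = died, 0 = censored).
months_a = [2, 3, 3, 5, 8, 8, 12, 20, 30, 42]
died_a = [1, 1, 0, 1, 0, 1, 0, 0, 1, 0]
months_b = [6, 9, 14, 14, 21, 25, 33, 40, 44, 48]
died_b = [0, 1, 0, 1, 0, 0, 1, 0, 0, 0]

curve_a = kaplan_meier(months_a, died_a)
curve_b = kaplan_meier(months_b, died_b)
for name in ("logrank", "breslow", "tarone-ware"):
    chi2, p = weighted_logrank(months_a, died_a, months_b, died_b, weight=name)
    print(f"{name:12s} chi2 = {chi2:.3f}  p = {p:.4f}")
```

Because the Breslow and Tarone-Ware weights grow with the number still at risk, they emphasize early differences between the curves, while the log-rank test weights all death times equally.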
Applying Logistic Regression to Survival Data
In the second phase of their study, Cameron and Hirsh wanted to examine how a soldier's life-circumstances might have affected his chances of survival.
“For each year older than 18 at enrollment, a soldier was about 3.4% less likely to survive.”
Cameron: "The problem was how to best describe these men. It seemed wise to ignore traditional descriptions of Poor Whites, Yeoman Farmers and Planters, as being so vague as to be practically meaningless. North Carolina was predominantly a state of small-scale agriculture and the men of the 30th reflect this. Additionally, many of the volunteers in 1861 were aged 17-25. The 1860 census returns show that a high percentage were unmarried and still lived at the family home. In such cases, I have assigned these men the wealth of the head of the family since that would return a far more accurate picture of their relative status.
"We came up with these totals for those primarily engaged in agriculture or hand labor of some sort: Laborers 335 men (29.5 per cent); Small Farmers and overseers 413 (36.4 per cent); Middling Farmers 156 (14.4 per cent); and Wealthy Farmers or planters 56 (5 per cent). Thus, 85.3 per cent of the men of the NC 30th were primarily engaged in agriculture at some level.
"Only 14.7 per cent of the men in the NC 30th made their primary living from non-agricultural work and even some of those, physicians and lawyers for example, also owned land and slaves and profited from agriculture. These men easily fit into the same patterns as their agricultural friends. We felt it reasonable to divide these men among the majority agricultural classes. Thus, we arrived at the following: 29.3 per cent of the 30th were classified as Laborers; 44.3 per cent as Small Farmers (including craftsmen); 18.3 per cent as Middling Farmers; and 7.4 per cent as Wealthy Farmers (including professionals)."
Expanding on the original mortality dataset, researchers compiled statistics for a number of independent variables:
- Whether or not the soldier was a conscript
- The age of enrollment in the regiment
- Whether the soldier was an officer
- Whether the soldier was a "substitute" [1]
- The value of the soldier's assets as determined by tax rolls
- The social class of the soldier
- Whether the soldier was from an agricultural background
- Number of slaves owned, if any
- Marital status
Using OriginLab's add-on Logistic Regression App, the researchers performed binary logistic regression analysis on the compiled statistics. "Logistic regression indicated some interesting data on who was more or less likely to survive. For example, officers were more likely to survive. For each year older than 18 at enrollment, a soldier was about 3.4% less likely to survive."
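As a rough illustration of what a binary logistic regression produces, the sketch below fits a model with a single yes/no predictor by Newton-Raphson and reports the exponentiated coefficient, which is the odds ratio. It is not the Logistic Regression App's code, the counts are invented rather than taken from the study, and the function name `fit_binary_predictor` is hypothetical. The real analysis used several predictors at once, so its odds ratios are adjusted for the other variables in a way this one-variable sketch is not.

```python
import math


def fit_binary_predictor(groups, iterations=25):
    """Fit logit(p) = b0 + b1 * x by Newton-Raphson on grouped data.

    groups: list of (x, n, survivors) with x equal to 0 or 1
    Returns the coefficients (b0, b1).
    """
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, n, survivors in groups:
            p = 1 / (1 + math.exp(-(b0 + b1 * x)))
            resid = survivors - n * p      # observed minus fitted survivors
            w = n * p * (1 - p)            # binomial variance weight
            g0 += resid
            g1 += x * resid
            h00 += w
            h01 += x * w
            h11 += x * x * w
        det = h00 * h11 - h01 * h01
        # Solve the 2x2 Newton system H * step = gradient by hand.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical counts, NOT the study's data: (x, men, survivors),
# with x = 0 for officers and x = 1 for enlisted men.
groups = [(0, 40, 30), (1, 200, 90)]
b0, b1 = fit_binary_predictor(groups)
odds_ratio = math.exp(b1)
print(f"odds ratio, enlisted vs. officer: {odds_ratio:.3f}")
```

With a single binary predictor the fitted odds ratio equals the ratio of the two groups' observed survival odds, which makes the result easy to check by hand.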
In summary, logistic regression showed that, when all other factors were equal:
- Enlistment age factored significantly in a soldier's chances of survival. For each year older than 18 at enrollment, a soldier's odds of surviving decreased by 3.4%.
- Enlisted men were 43% as likely to survive as officers were.
- Volunteers' chances of survival were only about half (53%) those of conscripts.
- Middling Farmers were the only social class that survived at a lower rate than the others. Their chances of survival were only 42% of those of Wealthy Farmers.
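The figures above can be read as odds ratios, and odds ratios combine by multiplication. A 3.4% drop per year therefore compounds: ten years over 18 multiplies the odds of survival by 0.966^10, or about 0.71, rather than lowering them by a flat 34%. The sketch below works through that arithmetic. It assumes that every figure in the list is an odds ratio and that the model has no interaction terms; the helper names and the 60% baseline probability are hypothetical.

```python
import math

# Figures quoted in the summary, read here as odds ratios (an assumption).
ODDS_RATIOS = {
    "per_year_over_18": 0.966,       # odds fall 3.4% for each year over 18
    "enlisted_vs_officer": 0.43,
    "volunteer_vs_conscript": 0.53,
}


def odds_multiplier(years_over_18=0, enlisted=False, volunteer=False):
    """Factor by which a reference soldier's survival odds are multiplied."""
    log_odds = years_over_18 * math.log(ODDS_RATIOS["per_year_over_18"])
    if enlisted:
        log_odds += math.log(ODDS_RATIOS["enlisted_vs_officer"])
    if volunteer:
        log_odds += math.log(ODDS_RATIOS["volunteer_vs_conscript"])
    return math.exp(log_odds)


def apply_to_probability(baseline_probability, multiplier):
    """Convert a baseline survival probability to odds, scale, convert back."""
    odds = baseline_probability / (1 - baseline_probability) * multiplier
    return odds / (1 + odds)


ten_years = odds_multiplier(years_over_18=10)
BASELINE = 0.60  # hypothetical survival probability for an 18-year-old
print(f"odds multiplier at age 28: {ten_years:.3f}")
print(f"survival probability at age 28: {apply_to_probability(BASELINE, ten_years):.3f}")
```

Odds and probabilities are not interchangeable, so the effect on the probability of survival depends on the baseline chosen; only the effect on the odds is fixed by the model.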
Historical Interpretation of the Survival Data
Cameron: "There is no easy reason why Middling Farmers should have had a lower survival rate than other social groups. However, consider the fact that a large proportion of Wealthy Farmers were officers, and indeed many were high-ranking officers. The numbers show that officers had a higher survival rate than enlisted men. Also, I think we would find that a great many of the initial volunteers were Middling Farmers or Small Farmers. If the ratio of volunteers to conscripts was higher for Middling Farmers than for other groups, it is more understandable that they had a lower survival rate.
Military Rank, Broken Down by Social Class
| Social Class | Privates | NCO | Lieutenants | Captains | Field and Staff |
|---|---|---|---|---|---|
| Laborers | 36.3% | 11.8% | 12.2% | | |
| Small Farmers or Craftsmen | 43.8% | 42.5% | 22.4% | 10% | |
| Middling Farmer | 15.5% | 40.7% | 36.7% | 50% | |
| Wealthy Farmer or Professional | 4.4% | 8.5% | 28.6% | 40% | 100% |
| Totals | 100% | 100% | 100% | 100% | 100% |
“Being a substitute was a way for a poor man to provide for his family, something of a life insurance policy. They were given money equal to 5-10 years' wages for a laborer.”
"At some point we ran survival for substitutes. This was a relatively small component of the NC 30th, and while the P-value (0.353) was found 'NOT significant' at the 5% level (no survival difference between Substitute vs. Volunteer/Conscript), we know from historical accounts that these soldiers generally had abysmal survival rates. Many were among the oldest of enlistees. Most were from the lower strata of society. Doubtless, in many cases, they were not only old but in very poor health. Bob Hirsh and I think that these men, who often were married with large families, probably did not necessarily think they would survive. Being a substitute was a way for a poor man to provide for his family, something of a life insurance policy. They were given money equal to 5-10 years' wages for a laborer. In one case that I found, a man who was initially a volunteer paid a substitute to replace him, then took on the obligation of caring for the substitute’s children when the substitute died.
"Why did conscripts survive so much better than volunteers? The first conscripts were drafted in the spring of 1862. Large numbers of volunteers (2/3 or so of the total enlistments) were already enrolled. Other men were conscripted later in 1862 and again in 1863 and 1864. Each time the numbers fell dramatically from the previous draft, for the pool of available men was so much smaller.
"This means several things: First, volunteers were already dying in large numbers of disease before the first conscript arrived. Second, many, if not most, conscripts were reluctant to be in the army. Some were openly opposed to the war and were forced to join at gunpoint. They were, therefore, very prone to desert, to fake illness or to find any way to avoid combat and, when possible, escape the army altogether. By 1864, even volunteers were prone to desert and often had turned against the war; but like soldiers in all wars they (volunteers) had strong loyalty to their comrades and were willing to tolerate more.
"It makes sense that officers survived better. True, in the Civil War large numbers of officers, including generals, died, but generally the higher the rank, the less the man was under direct fire, and he was almost never in hand-to-hand combat.
"Finally, the data on death from disease vs. battle injury is important. It stands to reason that men who were older and weaker were more prone to disease. The longer a man was in service, the greater his chances of contracting a deadly disease."
“The state of health of many volunteers in 1861 was abysmal...the regiment had lost nearly 10 per cent of its strength before its first battle at Gaines Mill, for reasons of health.”
Summarizing the overall physical condition of these men, Cameron had this to say: "The state of health of many volunteers in 1861 was abysmal. For the first six months the regiment apparently had no official surgeons, and physicians informally spent what time they could with the men to treat disease [2]. As the 30th was forming, these physicians rejected dozens of men as being unfit [3]. Many of those who passed the muster were healthy only by comparison to those rejected. They were of a feeble constitution and unable to bear the strain of training. By mid-June 1862, 54 men, roughly 5 per cent, had been discharged for debility, sickness or other weakness, along with one as under-aged, one as over-aged and one for larceny [4]. When combined with men dead of disease during the same period, it means the regiment had lost nearly 10 per cent of its strength before its first battle at Gaines Mill, for reasons of health [5]."
Footnotes
1. One interesting footnote to Civil War military service involved the use of "substitutes" - generally draft-ineligible men who were hired by a conscript to serve in his stead. Though highly controversial, hiring of substitutes was common practice in both the Confederate and Union armies. Substitutes were paid hundreds and sometimes thousands of dollars by their "principals" -- a large sum of money in a time when the average wage for a day laborer in the state of North Carolina was $0.54 a day, with board (see ncpedia.org and thecivilwaromnibus.com).
2. For example, Dr. J. M. Campbell was often with Company H. He was born in 1835 and lived in the Buffalo section of Moore County (now Lee County). In 1860 he lived with E. H. Cook, a 45-year-old woman, perhaps his housekeeper, and owned $2000 in personal property.
3. NC Troops gives the names of many of these men. They were not included in this study because they were never a part of the unit.
4. General debility, 12 men; discharged with no reason given, but certainly for health, 9 men; tuberculosis, variously described, 8 men; chronic rheumatism, almost certainly malaria, 7 men; injuries, 4 men; hernias, 3 men; heart disease, 2 men; insanity, 2 men; one man each for coxalgia, spinal disease, gonorrhea, cancer of the stomach, tuberculosis of the throat, and typhoid fever.
5. The actual loss of personnel was even worse than 10 per cent. Eight men had transferred to another unit; twenty-six (mostly officers not reelected in April-May, 1862) had resigned; eight had provided substitutes and left.