There has recently been an uproar over Georgia's approach to voter registration. The state law, passed last year, requires citizens' names on their government-issued IDs to match exactly their names as listed on the voter rolls. If the two do not match, that person's vote would not count. The Georgia NAACP and other civil rights groups have filed a lawsuit arguing that the measure, in force since July 2017, is aimed at disenfranchising racial minorities in the upcoming midterm elections.
Georgia Secretary of State Brian Kemp, a Republican running for governor against Democrat Stacey Abrams, has so far placed more than 53,000 voter registrations on hold because the names in their voting records do not match those in other sources of identification, such as driver's licenses and Social Security cards. If the measure remains in effect, voters whose information in these sources does not match exactly must present a valid photo ID on Election Day. This could suppress turnout, either because some voters lack ID or because voters are unsure whether they are eligible. Proponents of the rule claim that it is meant only to prevent illegal voting.
But does a missing hyphen, an initial instead of a full middle name, or a discrepancy of a single letter in a voter's name offer good evidence that the voter is not who they say they are? How can we know?
Researchers often have to compare records – and they have to do it right
Researchers often ask this question. In empirical research, they frequently have to combine different data sets using an imperfect identifier, such as agency names or individual addresses. While this can be tedious, finding the right matches is important. Match the wrong records, and any analysis may be completely unreliable. As a result, many data analysts keep only exact matches.
But while false matches can cause problems, so can dropping records that should match but have small discrepancies. Eliminating these records can also damage an analysis.
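To make the trade-off concrete, here is a minimal Python sketch, not taken from the study, that contrasts an exact comparison with a similarity score. The names are invented for illustration, and `difflib.SequenceMatcher` from the standard library is only a stand-in for the string-distance measures used in real record linkage.

```python
from difflib import SequenceMatcher


def exact_match(a: str, b: str) -> bool:
    """Return True only when the two strings are identical."""
    return a == b


def similarity(a: str, b: str) -> float:
    """Return a 0-1 similarity score, ignoring case and surrounding spaces."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


# Hypothetical record pairs: (name on the voter roll, name on another ID).
pairs = [
    ("Maria Garcia-Lopez", "Maria Garcia Lopez"),  # missing hyphen
    ("James T. Wilson", "James Thomas Wilson"),    # initial vs. full middle name
    ("Katherine Johnson", "Katherine Jonson"),     # one dropped letter
    ("Katherine Johnson", "Robert Miller"),        # genuinely different people
]

for roll_name, id_name in pairs:
    print(
        f"{roll_name!r} vs {id_name!r}: "
        f"exact={exact_match(roll_name, id_name)}, "
        f"similarity={similarity(roll_name, id_name):.2f}"
    )
```

All four pairs fail the exact comparison, but only the last one also scores low on similarity. An exact-match rule cannot tell these cases apart.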
That's why I spent the last three years developing a probabilistic record linkage algorithm, named "fastLink," that not only links records across data sets in a fast and automated way, but also tells the analyst how likely it is that an inexact match between two records is actually correct.
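The idea of attaching a probability to an inexact match can be sketched with the classic Fellegi-Sunter formulation of probabilistic record linkage. This is an illustrative sketch, not the fastLink implementation. The field names and every numeric value below (the `m` and `u` agreement probabilities and the prior) are assumptions chosen for the example. In practice such quantities would be estimated from the data.

```python
from typing import Dict

# Hypothetical parameters (assumptions for illustration only).
# m[field]: probability the field agrees when two records are the same person.
# u[field]: probability the field agrees when they are different people.
M_PROBS: Dict[str, float] = {"first": 0.95, "last": 0.95, "birth_year": 0.98, "street": 0.85}
U_PROBS: Dict[str, float] = {"first": 0.02, "last": 0.01, "birth_year": 0.02, "street": 0.005}
PRIOR_MATCH = 0.001  # assumed share of candidate pairs that are true matches


def match_probability(
    agreement: Dict[str, bool],
    m: Dict[str, float] = M_PROBS,
    u: Dict[str, float] = U_PROBS,
    prior: float = PRIOR_MATCH,
) -> float:
    """Posterior probability that a pair is a match, given which fields agree.

    Applies Bayes' rule, assuming fields agree or disagree independently
    of one another within the match and non-match groups.
    """
    like_match = prior
    like_nonmatch = 1.0 - prior
    for field, agrees in agreement.items():
        like_match *= m[field] if agrees else 1.0 - m[field]
        like_nonmatch *= u[field] if agrees else 1.0 - u[field]
    return like_match / (like_match + like_nonmatch)


# A pair that agrees on everything except the first name (for example, a typo).
near_match = {"first": False, "last": True, "birth_year": True, "street": True}
# A pair that agrees only on birth year.
unlikely = {"first": False, "last": False, "birth_year": True, "street": False}

print(f"near match: {match_probability(near_match):.4f}")
print(f"unlikely:   {match_probability(unlikely):.6f}")
```

Under these assumed values, a single disagreeing field still leaves a high match probability, while a pair that agrees only on birth year is almost certainly two different people. An analyst can then choose a probability threshold instead of discarding every inexact pair.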
In a recent study written with colleagues Ben Fifield and Kosuke Imai, we applied the algorithm to the question of voter identification. The results raise serious concerns about Georgia's exact match law and its likelihood of preventing tens of thousands of valid voters from voting.
How we did our research
We worked on linking two nationwide voter files from 2014 and 2015 obtained from L2 Inc., a national bipartisan firm that provides voter data and associated technology for campaigns. All active voters in 2014 appeared in the 2015 data set, which means we knew that every 2014 record had a true match. But many records had typographical discrepancies that prevented exact matches.
Our analysis found that the "exact match" approach would link only 66 percent of the records that actually belonged to the same voter, correctly identifying approximately 91 million voters. In other words, exact matching would exclude nearly 40 million records that actually refer to the same voter, disenfranchising those Americans.
What does that mean for the voters of Georgia?
Georgia's records had a higher proportion of exact matches than we found nationwide, but 30 percent of actual voters in the state still could not be matched exactly.
In contrast, the matches produced by our algorithm correlate almost perfectly with L2's internal match records (r = .99). In the 2014 nationwide data, it correctly matches nearly 127 million registered voters, or 93 percent of all voters. Among those whose records did not match exactly, we found that 25 percent have at least a 99 percent probability of being correct matches, while 28 percent have a probability of at least 95 percent.
In other words, using our algorithm, 91 percent of those on Georgia's voter rolls, or 3,941,342 citizens, would be cleared to vote, while "exact matching" clears only 70 percent and disenfranchises 909,540 eligible citizens.
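The Georgia figures fit together arithmetically. As a rough check, assuming the 91 and 70 percent shares are rounded and apply to the same voter roll, the implied roll size and the gap between the two methods can be recovered in a few lines of Python. The variable names are illustrative.

```python
# Figures quoted in the text.
cleared_by_algorithm = 3_941_342  # voters matched by the algorithm
share_algorithm = 0.91            # rounded share of Georgia's roll
share_exact = 0.70                # rounded share matched exactly

# Implied size of the roll (approximate, because the shares are rounded).
implied_roll = cleared_by_algorithm / share_algorithm

# Voters the algorithm matches but exact matching does not.
gap = (share_algorithm - share_exact) * implied_roll

print(f"implied roll size: {implied_roll:,.0f}")
print(f"gap between methods: {gap:,.0f}")
```

The implied roll is roughly 4.33 million records, and the 21-point gap between the two methods corresponds to about 909,540 voters, in line with the figure quoted above.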
I also tried to link respondents in the 2016 American National Election Studies (ANES) with voter records in the L2 data using two methods: exact matching and an improved version of fastLink that I recently developed.
The results appear in the table below. As you can see, the exact matching method misses a significant number of valid matches. While our algorithm validated 60 percent of voter records, exact matching confirmed, on average, less than 30 percent.
And in line with the fears of opponents of the Georgia measure, nonwhite voters are especially likely to be disenfranchised. The exact-match rates are nine and six percentage points lower for black and Hispanic voters, respectively, than for white voters.
Georgia's "exact match" law is the latest in a series of voter identification measures that critics claim are veiled voter-suppression tactics. Whether intended that way or not, Georgia's "exact match" rule will disproportionately prevent minority voters from casting their ballots.