All times are UTC - 8 hours [ DST ]




 Post subject: Propublica releases info about drug company payments to docs
PostPosted: Tue Jan 11, 2011 6:25 pm 
Joined: Fri Nov 05, 2010 11:20 am
Posts: 16
Location: Portland, Oregon
Real Name: Martin Mendelson
Began Programming in MUMPS: 15 Feb 1983
The organization ProPublica has been tracking drug company payments to docs who give talks and "continuing education" presentations for other docs. The idea is for patients to be able to know whether their doc is one of the "educators" involved. Their latest version of the data lists 384 docs who received more than $100,000 in the last several years - check it out at

http://projects.propublica.org/docdollars/top_earners

The folks at ProPublica aggregated and cleaned their data with a tool available from Google called Google Refine, which you can check out here

http://code.google.com/p/google-refine/

The only problem is that either Google Refine doesn't cut it, or the folks at ProPublica didn't use it properly.

An academic colleague brought me the original dataset and asked whether it could be reformatted for some research she is doing: what she wanted was a simple table with an extract of the data and per-doc totals. I opined that it could be done with MUMPS and set out to attempt it. Because the data had been put out by seven companies with no common standard, names might be in upper case or mixed case and might or might not include middle names or initials. First names might be complete, like "Christopher", or shortened to just "Chris". Worse yet, there were several instances where last and first names were identical but the docs were located on opposite sides of the continent.

Naturally, MUMPS was up to the challenge. By converting all names to upper case and then indexing each entry by the last name, the first three letters of the first name, and the state, it was possible to extract a bit over 18,000 unique docs from a list of over 35,000 payment records, and to determine how much each doc got from each drug company along with his or her total payment.
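
For anyone who wants to see the indexing idea spelled out, here is a rough sketch. It is written in Python rather than MUMPS, and it is not the routine that was actually used; the function names, the record layout and the sample records are all made up for illustration. It only assumes the key described above: upper-cased last name, first three letters of the upper-cased first name, and state.

```python
from collections import defaultdict


def doc_key(last, first, state):
    """Hypothetical matching key: LAST NAME, first 3 letters of FIRST NAME, STATE."""
    return (last.strip().upper(), first.strip().upper()[:3], state.strip().upper())


def totals_by_doc(records):
    """Group payments per doc and per company.

    records: iterable of (last, first, state, company, amount) tuples
    (an assumed layout, not the format of the real dataset).
    Returns {key: (per_company_dict, grand_total)}.
    """
    per_company = defaultdict(lambda: defaultdict(float))
    for last, first, state, company, amount in records:
        per_company[doc_key(last, first, state)][company] += amount
    return {
        key: (dict(companies), sum(companies.values()))
        for key, companies in per_company.items()
    }


# Made-up sample records, not taken from the real data.
records = [
    ("Smith", "Christopher", "OR", "CompanyA", 60000.0),
    ("SMITH", "Chris", "or", "CompanyB", 55000.0),       # same doc, other format
    ("Smith", "Christopher", "FL", "CompanyA", 1000.0),  # same name, other coast
]
result = totals_by_doc(records)
```

With this key, the first two sample records land on the same doc while the Florida "Smith" stays separate. A three-letter prefix is a blunt instrument (it would not match "Bill" to "William", for example), so treat this as an illustration of the approach rather than a complete matcher.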

Well, it turns out that there are actually 610 recipients of more than $100,000 - not 384! How did they get missed? Mostly because they got paid by more than one company and the reporting formats were different. I don't know whether Google Refine can't deal with this or the ProPublicans didn't utilize it properly. In any event, good old MUMPS came through.
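
To illustrate how a doc paid by more than one company can slip under the threshold, here is a small Python sketch with made-up numbers. It says nothing about how Google Refine works or how ProPublica actually grouped their records; it only compares grouping on names exactly as reported with grouping on the normalized key described above. All function names and records are hypothetical.

```python
from collections import defaultdict

THRESHOLD = 100_000


def exact_key(last, first, state):
    """Group on the name exactly as each company reported it."""
    return (last, first, state)


def normalized_key(last, first, state):
    """Group on upper-cased last name, first 3 letters of first name, state."""
    return (last.upper(), first.upper()[:3], state.upper())


def count_over(records, key_fn):
    """Count docs whose summed payments exceed THRESHOLD under a given key."""
    totals = defaultdict(float)
    for last, first, state, amount in records:
        totals[key_fn(last, first, state)] += amount
    return sum(1 for total in totals.values() if total > THRESHOLD)


# Made-up records: one doc reported two ways by two companies.
records = [
    ("Jones", "Patricia", "WA", 70000.0),
    ("JONES", "PAT", "WA", 45000.0),
    ("Lee", "Samuel", "CA", 120000.0),
]
exact_count = count_over(records, exact_key)
normalized_count = count_over(records, normalized_key)
```

Grouped on the names as reported, only one of these sample docs passes $100,000; grouped on the normalized key, the two "Jones" records combine and two docs pass.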

