I am working on a general organisation identifier index, particularly for use with IATI data. I have located some online sources, but I am missing a good source for US charities and not-for-profit organisations.

asked 22 Mar, 17:16

kitwallace

OpenCorporates is one possible source, but its coverage of charities and not-for-profits seems to be quite poor. A rather indirect source is the Economic Research Institute, but the information there is limited.

Both sites are used in the prototype organisation service with a prefix of US-EIN.
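
As a rough illustration of the prefix scheme, the sketch below joins and splits an identifier such as US-EIN-631222148. The hyphen between the prefix and the EIN, and the helper names make_identifier and split_identifier, are assumptions made for this example and are not confirmed by the service. The nine-digit check reflects the standard length of an EIN.

```python
import re
from typing import Tuple

PREFIX = "US-EIN"  # prefix named in the post; the "-" separator used below is an assumption


def make_identifier(ein: str) -> str:
    """Build a prefixed identifier from an EIN such as '63-1222148' or '631222148'."""
    digits = re.sub(r"\D", "", ein)  # drop the optional hyphen and any spaces
    if len(digits) != 9:
        raise ValueError(f"an EIN has nine digits, got {ein!r}")
    return f"{PREFIX}-{digits}"


def split_identifier(identifier: str) -> Tuple[str, str]:
    """Split a prefixed identifier back into (prefix, ein)."""
    prefix, _, code = identifier.rpartition("-")
    if prefix != PREFIX or not re.fullmatch(r"\d{9}", code):
        raise ValueError(f"not a {PREFIX} identifier: {identifier!r}")
    return prefix, code
```

Normalising the EIN to bare digits before prefixing means the same organisation always maps to the same identifier, whichever way the source wrote the number.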


answered 22 Mar, 17:23

kitwallace

First, you can start at the IRS webpage for exempt organizations:

http://www.irs.gov/taxstats/charitablestats/article/0,,id=97186,00.html

The zip files listed there can be found inside this open directory: http://www.irs.gov/pub/irs-soi/

Those are detailed files on exempt non-profits and private foundations.

The IRS also has e-Postcard downloads, which contain the main contact information. Think of them as a contact list with EINs, as opposed to the more in-depth SOI files.

If you really want more information then you can always head over to http://bulk.resource.org/irs.gov/eo2/

Those are the actual tax returns from 2002 through 2007 for ALL Exempt organizations.

Folders ending in PF are private foundations such as Rockefeller, Gates, Ford Foundation and so on. The ones without PF are standard non-profit organizations.

This is a massive dataset. When I first downloaded the files, it took my computer more than a week to move and rar them all for archiving. Since then the site has packed each month into one massive rar file, so it is all or nothing. If you want an alternative, go to: http://foundationcenter.org/findfunders/990finder/

This site has links to ALL the files from resource.org, but here you can download a single year or a single corporation instead of having to download entire large rar archives.

You need to be more specific, though, about the type of charity or non-profit you are looking for. There are multiple types: foundations, religious organizations, churches, political organizations and so on.

You say you are working on a "general organisation identifier index". What exactly do you mean by this? I am asking because we may be working on the same thing.


answered 23 Mar, 10:43

Data9er

Thanks for the info - those datasets look rather daunting!

My prototype service is http://opencirce.org/org. At present I'm focusing on supporting the organisation identifiers used in IATI data. For US charities this service uses the Economic Research Institute data to obtain the name.


(23 Mar, 10:52) kitwallace

Actually I think you are making it more complicated than it really is. Try it this way.

Go here: http://apps.irs.gov/app/eos/forwardToEpostDownloadLayout.do

That is the layout of the e-Postcard data provided by the IRS. There are 26 fields in that data file, the most important being the EIN!

If you really wanted to, you could split this data file into several tables.

Then simply download the ZIPPED Tabbed TXT file here:


Now import that data into MySQL or another database.

You now have the main IRS Exempt Organization database.
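
The import step could look roughly like the sketch below. It swaps in SQLite (via Python's standard library) for MySQL so that the example is self-contained. The table name exempt_org, the function name load_exempt_orgs, and the guess that the EIN sits in the first column are all assumptions; check them against the published layout before relying on this. Rather than naming all 26 fields, it stores the whole row as JSON next to the EIN.

```python
import csv
import io
import json
import sqlite3


def load_exempt_orgs(text_file, conn, ein_column=0, delimiter="\t"):
    """Load delimited rows into SQLite keyed on EIN; return the number of rows stored."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS exempt_org (ein TEXT PRIMARY KEY, fields TEXT)"
    )
    count = 0
    # QUOTE_NONE: treat quote characters in organisation names as ordinary text
    for row in csv.reader(text_file, delimiter=delimiter, quoting=csv.QUOTE_NONE):
        if len(row) <= ein_column or not row[ein_column].strip():
            continue  # skip blank or malformed lines
        conn.execute(
            "INSERT OR REPLACE INTO exempt_org (ein, fields) VALUES (?, ?)",
            (row[ein_column].strip(), json.dumps(row)),
        )
        count += 1
    conn.commit()
    return count


# Usage with two made-up rows (not real IRS data); ein_column=0 is an assumption
sample = io.StringIO("000000001\tExample Org A\n000000002\tExample Org B\n")
conn = sqlite3.connect(":memory:")
loaded = load_exempt_orgs(sample, conn)
```

For the real file you would open the unzipped TXT with open(path, newline="", encoding=...) and pass that in place of the sample.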

If you want to get more in-depth then you can link the EIN to:

http://www.eri-nonprofit-salaries.com and http://foundationcenter.org/

You do this by generating direct URLs for the actual EIN, like this: http://www.eri-nonprofit-salaries.com/index.cfm?FuseAction=NPO.Summary&EIN=631222148



So basically your PHP/HTML script holds the base URLs and appends the EIN to them.

Instead of creating two new fields in your database, one for each URL, the HTML/PHP details form would hold the URLs and simply insert the EIN into them. This keeps your DB size down: storing those two URLs for every record would take up considerable space, and there is no need for it.

So your PHP form would have something like this inside it:

$foundationcenter = 'http://dynamodata.fdncenter.org/990s/990search/esearch.php?990_type=&ei=' . urlencode($ein) . '&fy=&action=Find';

$eri = 'http://www.eri-nonprofit-salaries.com/index.cfm?FuseAction=NPO.Summary&EIN=' . urlencode($ein);

Make those two URLs open in a new window and voilà, you now have the IRS data mapped to both Foundation Center and ERI, as well as any other source you want.
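
The same link generation, sketched in Python for anyone not using PHP. The two base URLs and their query parameters are copied from this post; the function name links_for is made up for the example, and nothing here checks that the remote pages still exist.

```python
from urllib.parse import urlencode

# Base URLs as given in the post
FOUNDATION_CENTER = "http://dynamodata.fdncenter.org/990s/990search/esearch.php"
ERI = "http://www.eri-nonprofit-salaries.com/index.cfm"


def links_for(ein: str) -> dict:
    """Return the external lookup URLs for one EIN, built on demand rather than stored."""
    return {
        "foundationcenter": FOUNDATION_CENTER + "?" + urlencode(
            {"990_type": "", "ei": ein, "fy": "", "action": "Find"}
        ),
        "eri": ERI + "?" + urlencode({"FuseAction": "NPO.Summary", "EIN": ein}),
    }
```

Using urlencode rather than plain string concatenation keeps the query string valid even if a value ever contains spaces or ampersands, which matters more once you start passing organisation names instead of EINs.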

You could even auto-generate a link to the organization's state corporation registry in the same way, except that instead of the EIN you would supply the NAME as the search string.

WARNING: If you make a DB like this public, always make sure that you mark the outside URLs with rel="nofollow" so that Google and other spiders do not index content on the other site, especially when that site excludes the information in its robots.txt file.
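
A minimal sketch of how such a link could be rendered, assuming the page is generated from Python. The helper name external_link is hypothetical, and noopener is an extra not mentioned above: it is commonly added alongside target="_blank".

```python
from html import escape


def external_link(url: str, label: str) -> str:
    """Render an outbound link that opens in a new window and is marked nofollow."""
    return '<a href="{}" rel="nofollow noopener" target="_blank">{}</a>'.format(
        escape(url, quote=True),  # "&" in the query string becomes "&amp;"
        escape(label),
    )
```

Escaping the URL matters here because the generated links contain "&" between query parameters, which must appear as "&amp;" inside an HTML attribute.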


answered 23 Mar, 13:21

Data9er

edited 23 Mar, 13:30

Thanks for that link to the Foundation Centre. Actually I'm not developing a database, merely a set of descriptors which allow me to create links to related resources based on the codes - so I am in fact generating URLs as you suggest. For some small amounts of data like name I scrape the data on demand. In fact I'm using the ERI site for just this purpose.
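
To make the descriptor idea concrete, here is a small sketch under stated assumptions. It is not the code behind the service: the Descriptor class, the REGISTRY name, the {code} placeholder and the "-" separator after the prefix are all invented for illustration. Only the US-EIN prefix and the ERI URL come from this thread.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Descriptor:
    """One identifier scheme: its prefix plus URL templates for related resources."""
    prefix: str
    link_templates: Dict[str, str]


# Hypothetical registry with a single scheme; more prefixes would be added the same way
REGISTRY = {
    "US-EIN": Descriptor(
        prefix="US-EIN",
        link_templates={
            "eri": "http://www.eri-nonprofit-salaries.com/index.cfm"
                   "?FuseAction=NPO.Summary&EIN={code}",
        },
    ),
}


def related_links(identifier: str) -> Dict[str, str]:
    """Return named links for an identifier, or an empty dict for an unknown prefix."""
    for prefix, descriptor in REGISTRY.items():
        if identifier.startswith(prefix + "-"):
            code = identifier[len(prefix) + 1:]  # the part after "PREFIX-"
            return {
                name: template.format(code=code)
                for name, template in descriptor.link_templates.items()
            }
    return {}
```

Nothing is stored per organisation: supporting a new identifier scheme means adding one descriptor, not importing a dataset.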

(23 Mar, 13:49) kitwallace

Asked: 22 Mar, 17:16

Seen: 801 times

Last updated: 12 Jul, 08:09
