resMBS is a graph dataset extracted from the contents of financial prospectuses for US residential mortgage-backed securities filed with the SEC. These securities were first created in 2002. They reached a peak in 2006, started to decline in 2007, and came to an abrupt end in 2008. We extracted the "financial supply chain" comprising the "financial institutions" (FI) and the role (Role) that each plays in a financial contract (FC).
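As a rough illustration of the supply chain structure described above, the sketch below stores each extracted fact as an (FC, Role, FI) triple and derives the two sides of an FI-FC bipartite graph from it. The record layout, the `Edge` and `build_bipartite` names, and all contract, role, and institution values are hypothetical placeholders, not the actual schema or contents of the resMBS dataset.

```python
from collections import defaultdict
from typing import NamedTuple


class Edge(NamedTuple):
    """One extracted fact: an FI plays a Role on an FC (hypothetical layout)."""
    fc: str
    role: str
    fi: str


# Placeholder records; these are not taken from the dataset.
edges = [
    Edge("FC-1", "Servicer", "Bank A"),
    Edge("FC-1", "Trustee", "Bank B"),
    Edge("FC-2", "Servicer", "Bank A"),
    Edge("FC-2", "Originator", "Bank C"),
]


def build_bipartite(edges):
    """Return both adjacency maps of the FI-FC bipartite graph."""
    fi_to_fcs = defaultdict(set)
    fc_to_fis = defaultdict(set)
    for e in edges:
        fi_to_fcs[e.fi].add(e.fc)
        fc_to_fis[e.fc].add(e.fi)
    return dict(fi_to_fcs), dict(fc_to_fis)


fi_to_fcs, fc_to_fis = build_bipartite(edges)
```

Keeping the role on each triple, rather than only the FI-FC pair, preserves the information needed for any role-aware analysis later on.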
The following paper provides an overview of how the dataset was created and some preliminary clustering analysis on the graph: "resMBS: Constructing a Financial Supply Chain from Prospecti", Doug Burdick (IBM) and Soham De, Louiqa Raschid, Mingchao Shao, Zheng Xu, and Elena Zotkina (University of Maryland).
The networks described in the paper can be viewed here: FI clusters; FI clusters based on FC-FC similarity; FI-FC bipartite graph.
We used a topic modeling approach to develop two models: FI-Comm, where a topic is defined over a vocabulary of FIs, and Role-FI-Comm, where a topic is defined over a vocabulary of Role-FI pairs. See "Probabilistic Financial Community Models with Latent Dirichlet Allocation for Financial Supply Chains", Zheng Xu and Louiqa Raschid (University of Maryland).
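To make the difference between the two vocabularies concrete, here is a minimal sketch that turns (FC, Role, FI) records into bag-of-terms counts, once with FIs as terms and once with Role-FI pairs as terms. Treating each FC as one "document" is an assumption made for illustration, and the `bag_of_terms` helper and the sample records are hypothetical. The sketch only prepares input counts of the kind a topic model such as LDA consumes; it does not fit a model.

```python
from collections import Counter

# Placeholder (FC, Role, FI) records; these are not taken from the dataset.
records = [
    ("FC-1", "Servicer", "Bank A"),
    ("FC-1", "Originator", "Bank A"),
    ("FC-1", "Trustee", "Bank B"),
    ("FC-2", "Servicer", "Bank A"),
]


def bag_of_terms(records, with_role):
    """Count terms per FC; a term is an FI or a (Role, FI) pair."""
    docs = {}
    for fc, role, fi in records:
        term = (role, fi) if with_role else fi
        docs.setdefault(fc, Counter())[term] += 1
    return docs


# FI vocabulary: "Bank A" appears twice on FC-1, once per role.
fi_docs = bag_of_terms(records, with_role=False)

# Role-FI vocabulary: the same institution yields two distinct terms.
role_fi_docs = bag_of_terms(records, with_role=True)
```

With the FI vocabulary, an institution that plays several roles on a contract collapses into a single term with a higher count, whereas the Role-FI vocabulary keeps each role as a separate term.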
If you want the gory details of the tools that were developed on the IBM System T platform, see "Exploiting Lists of Names for Named Entity Identification of Financial Institutions from Unstructured Documents", Zheng Xu (University of Maryland), Douglas Burdick (IBM), and Louiqa Raschid (University of Maryland).
The dataset is available for research.