Difference between revisions of "DSMM: Data Science for Macro-Modeling with Financial and Economic Datasets"

(DSfin Financial Entity Resolution At Scale Challenge)
 
(166 intermediate revisions by the same user not shown)
Line 1: Line 1:
  
== DSfin Financial Entity Resolution At Scale Challenge ==
+
[[File:Sponsor_images2.png]] 
 +
===Overview===
  
'''Motivation'''
+
DSMM 2019 will explore the challenges of macro-modeling with financial and economic datasets. The workshop will also showcase the '''Financial Entity Identification and Information Integration (FEIII) Challenge''' and will involve a challenge task over small business data.
 +
[https://ir.nist.gov/feiii/]
 +
Financial big data and FINTECH applications are in the vanguard of activities around the deployment of Open Knowledge Networks.
 +
[http://ichs.ucsf.edu/open-knowledge-network/]
  
Develop a reference financial-entity identifier knowledge-base linking heterogeneous
+
Past Proceedings are available here: [https://dl.acm.org/citation.cfm?id=3336499]
collections of entity identifiers.
+
[https://dl.acm.org/citation.cfm?id=3220547] [https://dl.acm.org/citation.cfm?id=3077240]
 +
[https://dl.acm.org/citation.cfm?id=2951894] [https://dl.acm.org/citation.cfm?id=2630729].
  
  Specify challenges and techniques, and share expertise, for linking multiple open collections.
+
The advent of Big Data infrastructures and analytical tools can support the required information fusion, as well as macro-modeling with diverse datasets, and can potentially lead to the exploration of complex financial and economic ecosystems. Although integrating datasets may pose technical and policy/privacy challenges, the potential benefits are immense.
 +
For example, social media data often contains features that could enhance macroeconomic statistics. The resulting enriched datasets could explore hypotheses with a different focus or level of granularity, e.g., the ability to model small to medium enterprises (SMEs).  
  
Financial entity resolution (record linkage) is the first step towards the extraction and
+
The financial world is a closely interlinked web of financial entities and networks, supply chains and ecosystems. Analysts, regulators and researchers must address the challenges of monitoring, integrating, and analyzing at scale. Technical challenges include entity identification, entity classification, and learning relationships among entities. Addressing these challenges may result in improved tools for
  integration of complex financial knowledge across heterogeneous sources.
+
regulators to monitor financial systems or to set fiscal policy. Additional benefits may include fundamentally new designs of market mechanisms, new ways to reach consumers and to exploit the wisdom of the crowds.
  
== DSMM 2014 Schedule ==
+
We expect attendees with an interest in information integration, data mining, knowledge representation, network and visual analytics, stream data processing, etc. to participate.
 
 
'''8:30 a.m. to 10 a.m. - WELCOME and KEYNOTE and Opening Session on Financial Analytics'''
 
Papers in [[Session1]]
 
 
 
'''10:30 a.m. to 12 noon - Financial Data Integration Tools and Methods and POSTER SLAM'''
 
Papers in [[Session 2]]
 
 
'''12 noon to 1:30 p.m. - LUNCH in the Summit Room''' [[list-of-posters]]
 
Enjoy the view and the posters!
 
 
'''1:30 p.m. to 3 p.m. - Financial Networks and Games and Regulatory Data'''
 
Papers in [[Session 3]]
 
 
 
'''3:30 p.m. to 5 p.m. - DSfin Financial Entity Resolution At Scale CHALLENGE and WRAP-UP'''
 
[[Session 4]]
 
 
 
== Accepted Papers ==
 
[[list-of-papers]]
 
 
 
== Accepted Posters ==
 
[[list-of-posters]]
 
 
 
 
 
 
 
==Overview==
 
 
 
===Focus of the Workshop===
 
The increasing availability of Open Data from a variety of sources including the Web, social media and the government, in conjunction with the growth of Big Data infrastructures and analytics tools, provides the ability to model complex ecosystems enabling cyber-human decision making. While data-driven models have emerged for a range of challenges from climate modeling to systems biology to personalized medicine, there has been relatively little activity in macro-modeling using multiple heterogeneous financial and economic datasets.
 
 
 
The real promise of Open Data and Big Data lies in the dramatically increased value gained from integrating data from multiple sources, as illustrated by the following example: The systemic risks associated with the subprime lending market and the crash of the housing market in 2007 could have been modeled through a comprehensive integration and analysis of available public datasets. For example, the datasets relevant to the home mortgage supply chain include the following: (a) regulatory documents made available by MBS issuers, publicly traded financial institutions and mutual funds; (b) subscription-based third party datasets on underlying mortgages; (c) individual home transaction data such as sales, foreclosure and tax records; (d) local economic data such as employment and income levels; (e) financial news articles. Integrating these datasets might have provided financial analysts, regulators and academic researchers with comprehensive models to enable risk assessment.
 
 
 
Economists have been the leaders in creating longitudinal panel datasets and have had a successful history of using national datasets from the Census Bureau, the Department of Labor, etc., and global datasets from the UN, World Bank, etc. Here, too, there has been much less activity in modeling that integrates multiple heterogeneous datasets. While integrating datasets may pose technical, policy and privacy challenges, the potential benefits are immense. For example, social media data often contains features that could enhance macroeconomic statistics derived from traditional survey-driven datasets. Enriching longitudinal panel datasets with social media data could allow hypotheses to be explored with a different focus or level of granularity; for example, one could study the decision making of individuals whose social media profiles reflect their beliefs, intent, interests, sentiments, opinions, and state of mind.
 
 
 
This workshop will explore the challenges of data science for macro-modeling with financial and/or economic datasets. Two workshops, in 2010 and 2012, brought together a diverse community of academic researchers, regulators and practitioners who articulated the range of multi-disciplinary research challenges for macro-prudential modeling of financial systemic risk. The National Bureau of Economic Research Summer Institute in 2012 offered a workshop on novel data-centric techniques that attracted both economists and computer scientists. The workshop will target attendees of these prior meetings and will build upon the solid foundation established at these prior events.
 
 
 
'''Targeted Audience''': We expect a mix of paper submissions and attendees with an interest in information integration, data mining, knowledge representation, stream data processing, etc. A small number of domain specialists from finance and economics are also expected to attend.
 
 
 
===Important Dates===
 
Submission deadline:    '''EXTENDED!!!''' Monday March 31, 2014. '''EXTENDED!!!'''
 
Notification to authors: Friday May 2, 2014.
 
Camera-ready due:        Friday May 23, 2014.
 
Registration deadline:
 
Workshop:                Friday June 27, 2014.
 
 
 
===Submission Format===
 
Authors are invited to submit original, unpublished research papers that are not being considered for publication in any other forum.
 
We will accept the following types of papers:
 
* Regular papers of at most 6 pages will have a presentation slot.

* Extended abstracts of up to 2 pages will have a poster presentation and a short presentation slot if time permits.
 
 
 
Manuscripts should be submitted electronically as PDF files and be formatted using the SIGMOD camera-ready [http://www.acm.org/sigs/publications/proceedings-templates templates].
 
Authors are allowed to include extra material beyond the six pages as a clearly marked appendix, which reviewers are not obliged to read.
 
 
 
'''Submission Site'''
 
https://cmt.research.microsoft.com/DSMM2014/
 
 
 
== Organization ==
 
 
 
=== Program Chairs ===
 
{|
 
|| ||
 
|-
 
|Rajasekar Krishnamurthy    ||  IBM Research||                        rajase@us.ibm.com
 
|-
 
|Louiqa Raschid||            University of Maryland||              louiqa@umiacs.umd.edu
 
|-
 
|Shiv Vaithyanathan||        IBM Research||                        vaithyan@us.ibm.com
 
|-
 
 
 
|}
 
 
 
=== Steering Committee ===
 
{|

|| ||
 
|-
 
|Lise Getoor||   University of California Santa Cruz|| getoor@soe.ucsc.edu
 
|-
 
|Laura Haas||   IBM Research|| lmhaas@us.ibm.com
 
|-
 
|H.V. Jagadish||   University of Michigan|| jag@umich.edu
 
|-
 
|}
 
 
 
=== Program Committee ===
 
{|
 
|-
 
|Richard Anderson||   Lindenwood University|| rganderson.stl@gmail.com
 
|-
 
|Michael Cafarella||   University of Michigan|| michjc@umich.edu
 
|-
 
|Sanjiv Das||   Santa Clara University|| srdas@scu.edu
 
|-
 
|Amol Deshpande||            University of Maryland|| amol@cs.umd.edu
 
|-
 
|Mark Flood||   Office of Financial Research||         mark.flood@treasury.gov
 
|-
 
|Juliana Freire||   New York University||         juliana.freire@nyu.edu
 
|-
 
|Gerard Hoberg||   University of Maryland|| ghoberg@rhsmith.umd.edu
 
|-
 
|Vasant Honavar||            Pennsylvania State University||        vhonavar@ist.psu.edu
 
|-
 
|Joe Langsam||   University of Maryland|| jlangsam@rhsmith.umd.edu
 
|-
 
|Shawn Mankad||              University of Maryland||              smankad@rhsmith.umd.edu     
 
|-
 
|Frank Olken||              National Science Foundation||          folken@nsf.gov
 
|-
 
|Felix Naumann||   Hasso Plattner Institute, Germany||    felix.naumann@hpi.uni-potsdam.de
 
|-
 
|Christopher Ré||   Stanford University||                 chrismre@cs.stanford.edu
 
|-
 
|| ||
 
|-
 
|'''Webmaster'''
 
|-
 
|Peratham Wiriyathammabhum||University of Maryland||peratham@cs.umd.edu
 
|}
 
 
 
 

Latest revision as of 22:16, 10 July 2019

Sponsor images2.png

Overview

DSMM 2019 will explore the challenges of macro-modeling with financial and economic datasets. The workshop will also showcase the Financial Entity Identification and Information Integration (FEIII) Challenge and will involve a challenge task over small business data. [1] Financial big data and FINTECH applications are in the vanguard of activities around the deployment of Open Knowledge Networks. [2]

Past Proceedings are available here: [3] [4] [5] [6] [7].

The advent of Big Data infrastructures and analytical tools can support the required information fusion, as well as macro-modeling with diverse datasets, and can potentially lead to the exploration of complex financial and economic ecosystems. Although integrating datasets may pose technical and policy/privacy challenges, the potential benefits are immense. For example, social media data often contains features that could enhance macroeconomic statistics. The resulting enriched datasets could explore hypotheses with a different focus or level of granularity, e.g., the ability to model small to medium enterprises (SMEs).

The financial world is a closely interlinked web of financial entities and networks, supply chains and ecosystems. Analysts, regulators and researchers must address the challenges of monitoring, integrating, and analyzing at scale. Technical challenges include entity identification, entity classification, and learning relationships among entities. Addressing these challenges may result in improved tools for regulators to monitor financial systems or to set fiscal policy. Additional benefits may include fundamentally new designs of market mechanisms, new ways to reach consumers and to exploit the wisdom of the crowds.

We expect attendees with an interest in information integration, data mining, knowledge representation, network and visual analytics, stream data processing, etc. to participate.