Data Dictionaries

Embed our data directly into your workflow.

Coverage

  • ~2900 currently active stocks.
  • ~2700 currently inactive stocks that appear in the historical data.
  • Since 2004, we have focused our coverage on currently live, actively-traded stocks because our primary clients have been fundamental PMs. We've always aimed to maintain coverage of 2900 - 3000 active stocks. Due to bankruptcies, acquisitions, delistings, etc., we lose about 20 companies per month from coverage. When adding companies to coverage, we prioritize companies with higher market caps and trading volumes, along with big IPOs and client requests.

Coverage information is updated daily and available here.

Time Frame

Annual data from 10-Ks (and other annual filings) is available from 1998 to the present. Quarterly data from 10-Qs (and other quarterly filings) is available from 2012 to the present. All data is presented as trailing-twelve-months (TTM) data (more details on TTM are below).

Source

We source all data directly from the annual and quarterly SEC filings using our proprietary Robo Analyst technology. All calculations are our own.

Point-in-Time Data

All data is provided as of the dates presented in the dataset file. Only data available as of the as_of_date is used in our models. More information on data dates is provided below.
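As an illustration of the point-in-time rule, here is a minimal sketch of a lookup that never returns data dated after the query date. It assumes the rows for one ticker have been loaded as (as_of_date, value) pairs sorted by date. The function name `value_as_of` and the sample rows are hypothetical and are not part of the dataset or any New Constructs API.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical rows for one ticker: (as_of_date, value), sorted by as_of_date.
rows = [
    (date(2020, 1, 3), 10.0),
    (date(2020, 1, 10), 11.5),
    (date(2020, 1, 17), 9.8),
]


def value_as_of(rows, query_date):
    """Return the latest value whose as_of_date is on or before query_date.

    Returns None when no row was available yet, so no future data is used.
    """
    dates = [d for d, _ in rows]
    idx = bisect_right(dates, query_date)  # count of rows dated <= query_date
    return rows[idx - 1][1] if idx else None


print(value_as_of(rows, date(2020, 1, 12)))  # 11.5
print(value_as_of(rows, date(2020, 1, 1)))   # None
```

A backtest that queries by date this way only sees values that were already published on that date.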

Trailing-Twelve-Month (TTM) Data

All data is for the trailing twelve months. Annual data is shown when a company's most recent filing is an annual filing (10-K). Trailing-twelve-month data from the prior 4 quarters is shown when a company's most recent filing is a quarterly filing (10-Q).
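For intuition only, the sketch below sums the four most recent quarterly values of a flow item such as revenue to get a trailing-twelve-month figure. This is an assumption about the general TTM idea, not a description of our proprietary calculations. The function name `ttm_sum` and the sample figures are hypothetical.

```python
def ttm_sum(quarterly_values):
    """Sum the four most recent quarterly values of a flow item (e.g. revenue).

    quarterly_values is ordered oldest to newest. Raises ValueError when
    fewer than four quarters are available.
    """
    if len(quarterly_values) < 4:
        raise ValueError("need at least four quarters for a TTM figure")
    return sum(quarterly_values[-4:])


# Hypothetical quarterly revenue, oldest first.
revenue = [100, 110, 120, 130, 140]
print(ttm_sum(revenue))  # 110 + 120 + 130 + 140 = 500
```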

How Amended Filings (10-K/As and 10-Q/As) Affect Datasets

Currently, our backtest datasets present data from only one filing per annual or quarterly period. When a company updates its original filing with an amended filing, we present the data from only the amended filing in the backtest dataset. We collect and present data from an amended filing only when it provides materially different financial data from the original filing.

All data remains point-in-time data because data from amended filings does not appear in the datasets until the filing date of the amended filing. Leaving out the data from the original filing is an unintended result of the original structure of our database and models; both were built with a preference for the most up-to-date amended data for historical periods.

We're working to update our systems to include the data from the original filings as well as amended filings in our backtest datasets.

This issue causes us to exclude data from 1,642 original filings (~1% of total filings) because an amended filing is used in place of the original filing. On average, the amended data is filed and parsed 184 days after the original filing date. As a result, when an amended filing is used, the gap between datapoints in our backtest datasets averages 549 days (1 year + 184 days).
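The gap arithmetic above can be checked with a short sketch. It assumes a 365-day spacing between consecutive annual filings. The constant names, the function `gap_days`, and the sample date are hypothetical.

```python
from datetime import date, timedelta

AVG_AMENDMENT_LAG_DAYS = 184  # average lag stated in the text above
ANNUAL_PERIOD_DAYS = 365      # assumed spacing between annual filings


def gap_days(amendment_lag_days=AVG_AMENDMENT_LAG_DAYS):
    """Days between datapoints when an amended filing replaces the original."""
    return ANNUAL_PERIOD_DAYS + amendment_lag_days


prior_filing = date(2019, 3, 1)  # hypothetical date of the previous datapoint
next_datapoint = prior_filing + timedelta(days=gap_days())

print(gap_days())      # 549
print(next_datapoint)  # date on which the amended data would appear
```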

Dataset Generation Date

Included in the file name for the dataset, the dataset generation date is the date on which the backtest dataset was generated.

Frequency

Datapoints impacted by market price changes are updated weekly. Financial data not impacted by market price changes are updated on the filing date.

Definitions

  • ticker - The ticker for the security on the file generation date. Tickers that include a colon are currently inactive stocks. They are no longer traded because they were acquired, went bankrupt, etc. We assign the last used ticker to the security followed by a colon and a number that increments for each new company that becomes inactive with that ticker. For example, XYZ Corp uses ticker XYZ and goes inactive. We assign the company the ticker XYZ:1 because it is the first company in our system to go inactive using ticker XYZ. If a different company, XYZ Technology, starts using ticker XYZ and goes inactive, it will be assigned XYZ:2. A list of tickers and company names is available on our website or through the coverage endpoint of our API.
  • company_name - The name of the company on the file generation date.
  • cik - The Central Index Key (CIK) used by the SEC to identify corporations and individuals who have filed with the SEC. We do not provide CUSIPs or other industry identifiers for securities. CIK is provided to help map securities from New Constructs to other data sets. For active companies, the CIK is the one in use by the SEC on the data generation date. For inactive companies, the CIK is the last one in use by the company prior to its being inactivated.
  • figi - The Financial Instrument Global Identifier (FIGI) is an established global standard issued under the guidelines of the Object Management Group (OMG.org, an international, non-profit standards organization), founded in 1989. FIGI is provided to help map securities from New Constructs to other data sets. Please see OpenFIGI for details on OpenFIGI and its use.
  • stock_exchange - The exchange on which a ticker trades. For active stocks, the exchange is the one on which the ticker was traded on the data generation date. For inactive stocks, the exchange is the last one on which the ticker was traded prior to its being inactivated.
  • company_status_current - The trading status of the security on the dataset generation date. Actively traded stocks are marked as 'live'. Inactive stocks that have been delisted and are no longer traded are marked as 'inactive'.
  • fiscal_year - The fiscal year of the most recent filing data used on the as_of_date.
  • fiscal_quarter - The fiscal quarter of the most recent filing data on the as_of_date. If the most recent filing is an annual filing, this field will be null. If the most recent filing is a quarterly filing, this field will show the quarter: 1, 2, or 3, indicating the data belongs to a trailing-twelve-month (TTM) model.
  • filing_type - The filing type of the most recent filing data on the as_of_date - generally a 10-K or 10-Q, though other filing types are also used.
  • filing_date - The SEC filing date for the most recent filing on the as_of_date. Data is generally available to clients within 12 to 48 hours after a new filing is filed with the SEC.
  • period_end_date - The fiscal period end date of the most recent filing on the as_of_date.
  • as_of_date - The point-in-time date applicable to the data presented. Data that are affected by stock price use closing stock prices on the as_of_date. Data that are not impacted by stock price are updated on the filing_date. Only filing data available on the as_of_date is used to calculate our derived data. No future data is used.
  • Data columns - We provide descriptions of the data for each column included in each dataset in the documentation for each dataset. All data values are reported in single units (ones), not in thousands or millions. Datapoints that are impacted by stock price changes are marked with an asterisk (*) in the documentation.
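The ticker convention in the definitions above (a colon followed by a number for inactive stocks) can be handled with a small helper. This is a minimal sketch. The function name `parse_ticker` is hypothetical, and it assumes the suffix after the colon is always an integer, as in the XYZ:1 and XYZ:2 examples.

```python
def parse_ticker(ticker):
    """Split a ticker into (base_symbol, inactive_sequence).

    inactive_sequence is None for tickers without a colon (active stocks)
    and an int for inactive tickers such as "XYZ:2".
    """
    base, sep, suffix = ticker.partition(":")
    if not sep:
        return base, None
    return base, int(suffix)


print(parse_ticker("XYZ"))    # ('XYZ', None)
print(parse_ticker("XYZ:2"))  # ('XYZ', 2)
```

Keeping the sequence number separate makes it possible to group every company that has ever used a given symbol while still telling them apart.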