infineac.file_loader.load_files_from_xml

infineac.file_loader.load_files_from_xml(files: list) → list[dict]

Parses the given XML files and extracts the earnings call information from each of them.

Parameters:

files (list) – List of XML files to be parsed.
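As an illustration of how the files argument might be assembled, the sketch below collects every .xml file under a directory. The helper name collect_xml_files, the temporary directory, and the placeholder file contents are hypothetical; whether load_files_from_xml expects path strings or Path objects is not stated here, so passing strings is an assumption.

```python
import tempfile
from pathlib import Path


def collect_xml_files(root):
    """Return a sorted list of paths (as strings) to all .xml files under root.

    Hypothetical helper, not part of infineac.
    """
    return sorted(str(p) for p in Path(root).rglob("*.xml"))


# Demonstration on a throwaway directory with placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "a.xml").write_text("<placeholder/>")
    (Path(tmp) / "notes.txt").write_text("not an xml file")
    xml_files = collect_xml_files(tmp)
    # Assumption: a list like xml_files could then be passed on, e.g.
    # events = infineac.file_loader.load_files_from_xml(xml_files)
```

Sorting the paths keeps the order of the resulting list reproducible across runs.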

Returns:

List of dictionaries, one per file, containing the information extracted from the earnings calls. Each dictionary contains the following key-value pairs:

  • 'file': str - the file name

  • 'year_upload': int - the year of the upload

  • 'corp_participants': list[dict] - the corporate participants

  • 'corp_participants_collapsed': list[str] - the corporate participants as a collapsed list

  • 'conf_participants': list[dict] - the conference call participants

  • 'conf_participants_collapsed': list[str] - the conference call participants as a collapsed list

  • 'presentation': list[dict] - the presentation part

  • 'presentation_collapsed': list[str] - the presentation part as a collapsed list

  • 'qa': list[dict] - the Q&A part

  • 'qa_collapsed': list[str] - the Q&A part as a collapsed list

  • 'action': str - the action (e.g. publish)

  • 'story_type': str - the story type (e.g. transcript)

  • 'version': str - the version of the publication (e.g. final)

  • 'title': str - the title of the earnings call

  • 'city': str - the city of the earnings call

  • 'company_name': str - the company of the earnings call

  • 'company_ticker': str - the company ticker of the earnings call

  • 'date': date - the date of the earnings call

  • 'id': int - the id of the publication

  • 'last_update': date - the last update of the publication

  • 'event_type_id': int - the event type id

  • 'event_type_name': str - the event type name

Return type:

list[dict]
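Because the result is a plain list of dictionaries, it can be processed with ordinary Python tools. The following is a minimal sketch that uses a hand-written stand-in for the return value; the records, their values, and the helper names calls_per_year and final_only are hypothetical, and only a few of the documented keys are filled in.

```python
from collections import Counter
from datetime import date

# Hypothetical stand-in for the list returned by load_files_from_xml.
# All values are illustrative placeholders, not real data.
events = [
    {"file": "call_a.xml", "year_upload": 2022, "version": "final",
     "company_name": "Example Corp", "date": date(2022, 2, 1)},
    {"file": "call_b.xml", "year_upload": 2022, "version": "preliminary",
     "company_name": "Sample Inc", "date": date(2022, 5, 3)},
    {"file": "call_c.xml", "year_upload": 2023, "version": "final",
     "company_name": "Example Corp", "date": date(2023, 2, 2)},
]


def calls_per_year(records):
    """Count the records per upload year."""
    return Counter(r["year_upload"] for r in records)


def final_only(records):
    """Keep only the records whose 'version' value is 'final'."""
    return [r for r in records if r.get("version") == "final"]


per_year = calls_per_year(events)
final_files = [r["file"] for r in final_only(events)]
```

Using r.get("version") rather than r["version"] is a defensive choice in case a record lacks that key; whether keys can actually be missing is not specified above.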