infineac.file_loader.load_files_from_xml

infineac.file_loader.load_files_from_xml(files: list) → list[dict]

Parses the XML files and extracts the information from the earnings calls.

Parameters:

files (list) – List of XML files to be parsed.

Returns:

List of dictionaries containing the extracted information from the earnings calls. For each file, the information is extracted into a dictionary that contains the following key-value pairs:

'file': str - the file name

'year_upload': int - the year of the upload

'corp_participants': list[dict] - the corporate participants

'corp_participants_collapsed': list[str] - collapsed list of the corporate participants

'conf_participants': list[dict] - the conference call participants

'conf_participants_collapsed': list[str] - collapsed list of the conference call participants

'presentation': list[dict] - the presentation part

'presentation_collapsed': list[str] - collapsed list of the presentation part

'qa': list[dict] - the Q&A part

'qa_collapsed': list[str] - collapsed list of the Q&A part

'action': str - the action (e.g. publish)

'story_type': str - the story type (e.g. transcript)

'version': str - the version of the publication (e.g. final)

'title': str - the title of the earnings call

'city': str - the city of the earnings call

'company_name': str - the company of the earnings call

'company_ticker': str - the company ticker of the earnings call

'date': date - the date of the earnings call

'id': int - the id of the publication

'last_update': date - the last update of the publication

'event_type_id': int - the event type id

'event_type_name': str - the event type name

Return type:

list[dict]
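
Example:

The following is a minimal sketch of how the returned list might be consumed. It does not call the library. Instead it uses a hand-built record that follows the key-value pairs documented above. All values in `sample_events` are invented for illustration, only a subset of the keys is shown, and `summarize_events` is a hypothetical helper that is not part of infineac.

```python
from datetime import date

# Hypothetical record shaped like one entry of the returned list; every value
# here is invented for illustration, and only a subset of the keys is shown.
sample_events = [
    {
        "file": "example_call.xml",
        "year_upload": 2022,
        "company_name": "Example Corp",
        "company_ticker": "EXM",
        "date": date(2022, 3, 1),
        "version": "final",
        "presentation_collapsed": ["Welcome to the call.", "Revenue grew."],
        "qa_collapsed": ["What drove the growth?"],
    }
]


def summarize_events(events):
    """Return one short summary line per event dictionary (hypothetical helper)."""
    lines = []
    for event in events:
        # Count the entries of the two collapsed text lists.
        n_entries = len(event["presentation_collapsed"]) + len(event["qa_collapsed"])
        lines.append(
            f"{event['date'].isoformat()} {event['company_ticker']}: "
            f"{n_entries} collapsed entries from {event['file']}"
        )
    return lines


# With the package installed, the list would instead come from the function
# documented here, for example:
#   from infineac.file_loader import load_files_from_xml
#   events = load_files_from_xml(files)
for line in summarize_events(sample_events):
    print(line)
```

Because every file yields a dictionary with the same keys, the result can be filtered or aggregated with ordinary list and dictionary operations, as the helper above does.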