Basic types
Constants
-
const std::size_t orcus::INDEX_NOT_FOUND
Generic constant to be used to indicate that a valid index value is expected but not found.
-
const xmlns_id_t orcus::XMLNS_UNKNOWN_ID
Value associated with an unknown XML namespace.
-
const xml_token_t orcus::XML_UNKNOWN_TOKEN
Value associated with an unknown XML token.
Type aliases
-
using orcus::xml_token_attrs_t = std::vector<xml_token_attr_t>
Structs
-
struct date_time_t
Struct that holds a date or date-time value.
Public Functions
-
date_time_t()
-
date_time_t(int _year, int _month, int _day)
-
date_time_t(int _year, int _month, int _day, int _hour, int _minute, double _second)
-
date_time_t(const date_time_t &other)
-
~date_time_t()
-
date_time_t &operator=(date_time_t other)
-
bool operator==(const date_time_t &other) const
-
bool operator!=(const date_time_t &other) const
-
bool operator<(const date_time_t &other) const
-
std::string to_string() const
Convert the date-time value to an ISO-formatted string representation.
- Returns:
ISO-formatted string representation of the date-time value.
-
void swap(date_time_t &other)
Swap the value with another instance.
- Parameters:
other – another instance to swap values with.
Public Static Functions
-
static date_time_t from_chars(std::string_view str)
Parse an ISO-formatted string representation of a date-time value, and convert it into a date_time_t value. The string may hold either a date-only value or a date and time value; a time-only value is not allowed.
Here are some examples of ISO-formatted date and date-time values:
2013-04-09 (date only)
2013-04-09T21:34:09.55 (date and time)
- Parameters:
str – string representation of a date-time value.
- Returns:
converted date-time value consisting of a set of numeric values.
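The accepted input shapes of from_chars() can be illustrated with a minimal standalone sketch. It uses only the standard library; `iso_date_time` and `parse_iso_date_time` are hypothetical names introduced for this illustration and are not part of orcus, and the sketch performs no range validation on the parsed fields.

```cpp
#include <cstddef>
#include <cstdio>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical stand-in for illustration only; not the orcus date_time_t.
struct iso_date_time
{
    int year = 0;
    int month = 0;
    int day = 0;
    int hour = 0;
    int minute = 0;
    double second = 0.0;
};

// Accepts "YYYY-MM-DD" or "YYYY-MM-DDThh:mm:ss[.fff]".
// Returns std::nullopt for anything else, including a time-only string.
std::optional<iso_date_time> parse_iso_date_time(std::string_view str)
{
    const std::string buf(str); // sscanf needs a null-terminated buffer
    iso_date_time v;
    int date_len = 0;

    // The date part is mandatory.
    if (std::sscanf(buf.c_str(), "%4d-%2d-%2d%n", &v.year, &v.month, &v.day, &date_len) != 3)
        return std::nullopt;

    if (static_cast<std::size_t>(date_len) == buf.size())
        return v; // date only

    // Whatever follows the date must be 'T' plus a complete time value.
    int time_len = 0;
    if (std::sscanf(buf.c_str() + date_len, "T%2d:%2d:%lf%n", &v.hour, &v.minute, &v.second, &time_len) != 3)
        return std::nullopt;

    if (static_cast<std::size_t>(date_len + time_len) != buf.size())
        return std::nullopt; // trailing characters after the time value

    return v;
}
```

With this sketch, both example strings above parse successfully, while a time-only string such as 21:34:09 is rejected.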
-
struct length_t
Holds a length value with unit of measurement.
Public Functions
-
length_t()
-
length_t(length_unit_t _unit, double _value)
-
std::string to_string() const
-
struct parse_error_value_t
Parser token that represents the state of a parse error, used by threaded_json_parser and threaded_sax_token_parser when transferring parse status between threads.
Public Functions
-
parse_error_value_t()
-
parse_error_value_t(const parse_error_value_t &other)
-
parse_error_value_t(std::string_view _str, std::ptrdiff_t _offset)
-
parse_error_value_t &operator=(const parse_error_value_t &other)
-
bool operator==(const parse_error_value_t &other) const
-
bool operator!=(const parse_error_value_t &other) const
-
struct xml_declaration_t
Struct holding XML declaration properties.
Public Functions
-
xml_declaration_t()
-
xml_declaration_t(uint8_t _version_major, uint8_t _version_minor, character_set_t _encoding, bool _standalone)
-
xml_declaration_t(const xml_declaration_t &other)
-
~xml_declaration_t()
-
xml_declaration_t &operator=(const xml_declaration_t &other)
-
bool operator==(const xml_declaration_t &other) const
-
bool operator!=(const xml_declaration_t &other) const
Public Members
-
uint8_t version_major
-
uint8_t version_minor
-
character_set_t encoding
-
bool standalone
-
struct xml_name_t
Represents a name with a normalized namespace in XML documents. This can be used either as an element name or as an attribute name.
Public Functions
-
xml_name_t() noexcept
-
xml_name_t(xmlns_id_t _ns, std::string_view _name)
-
xml_name_t(const xml_name_t &other)
-
xml_name_t &operator=(const xml_name_t &other)
-
bool operator==(const xml_name_t &other) const noexcept
-
bool operator!=(const xml_name_t &other) const noexcept
-
std::string to_string(const xmlns_context &cxt, to_string_type type) const
Convert a namespace-name value pair to a string representation with the namespace value converted to either an alias or a unique “short name”. Refer to get_alias() and get_short_name() for the explanations of an alias and short name.
- Parameters:
cxt – namespace context object associated with the XML stream currently being parsed.
type – policy on how to convert a namespace identifier to a string representation.
- Returns:
string representation of a namespace-name value pair.
-
std::string to_string(const xmlns_repository &repo) const
Convert a namespace-name value pair to a string representation with the namespace value converted to a unique “short name”. Refer to get_short_name() for the explanations of a short name.
- Parameters:
repo – namespace repository.
- Returns:
string representation of a namespace-name value pair.
-
struct xml_token_attr_t
Struct containing properties of a tokenized XML attribute.
Public Functions
-
xml_token_attr_t()
-
xml_token_attr_t(const xml_token_attr_t &other)
-
xml_token_attr_t(xmlns_id_t _ns, xml_token_t _name, std::string_view _value, bool _transient)
-
xml_token_attr_t(xmlns_id_t _ns, xml_token_t _name, std::string_view _raw_name, std::string_view _value, bool _transient)
-
xml_token_attr_t &operator=(const xml_token_attr_t &other)
Public Members
-
xmlns_id_t ns
-
xml_token_t name
-
std::string_view raw_name
-
std::string_view value
-
bool transient
Whether or not the attribute value is transient. A transient value is only guaranteed to be valid until the end of the start_element call; after that, its validity is not guaranteed. A non-transient value is guaranteed to be valid during the life cycle of the XML stream it belongs to.
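The practical consequence of the transient flag is that a handler wanting to keep an attribute value past the callback has to copy it when the flag is set. The following is a minimal sketch of that pattern using only the standard library; `attr_view` and `attr_value_store` are hypothetical names that mirror just the `value` and `transient` members, and are not orcus types.

```cpp
#include <cstddef>
#include <deque>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-in mirroring only the two members relevant here.
struct attr_view
{
    std::string_view value;
    bool transient = false;
};

// Retains attribute values beyond the callback that delivered them.
class attr_value_store
{
    // std::deque keeps element addresses stable across push_back, so the
    // views stored in m_values never dangle when more values are added.
    std::deque<std::string> m_owned;
    std::vector<std::string_view> m_values;

public:
    void add(const attr_view& attr)
    {
        if (attr.transient)
        {
            // Copy now: the viewed buffer may be gone once the callback returns.
            m_owned.emplace_back(attr.value);
            m_values.push_back(m_owned.back());
        }
        else
        {
            // Assumed to stay valid for as long as the stream itself.
            m_values.push_back(attr.value);
        }
    }

    const std::vector<std::string_view>& values() const { return m_values; }
    std::size_t owned_count() const { return m_owned.size(); }
};
```

Copying only the transient values avoids duplicating strings that already outlive the handler, at the cost of tracking two kinds of storage.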
-
struct xml_token_element_t
Struct containing XML element properties passed to the handler of sax_token_parser via its start_element() and end_element() calls.
Public Functions
-
xml_token_element_t &operator=(xml_token_element_t) = delete
-
xml_token_element_t()
-
xml_token_element_t(xmlns_id_t _ns, xml_token_t _name, std::string_view _raw_name, std::vector<xml_token_attr_t> &&_attrs)
-
xml_token_element_t(const xml_token_element_t &other)
-
xml_token_element_t(xml_token_element_t &&other)
Enums
-
enum class orcus::character_set_t
Character set types, generated from IANA character-sets specifications.
Values:
-
enumerator unspecified
-
enumerator adobe_standard_encoding
-
enumerator adobe_symbol_encoding
-
enumerator amiga_1251
-
enumerator ansi_x3_110_1983
-
enumerator asmo_449
-
enumerator big5
-
enumerator big5_hkscs
-
enumerator bocu_1
-
enumerator brf
-
enumerator bs_4730
-
enumerator bs_viewdata
-
enumerator cesu_8
-
enumerator cp50220
-
enumerator cp51932
-
enumerator csa_z243_4_1985_1
-
enumerator csa_z243_4_1985_2
-
enumerator csa_z243_4_1985_gr
-
enumerator csn_369103
-
enumerator dec_mcs
-
enumerator din_66003
-
enumerator dk_us
-
enumerator ds_2089
-
enumerator ebcdic_at_de
-
enumerator ebcdic_at_de_a
-
enumerator ebcdic_ca_fr
-
enumerator ebcdic_dk_no
-
enumerator ebcdic_dk_no_a
-
enumerator ebcdic_es
-
enumerator ebcdic_es_a
-
enumerator ebcdic_es_s
-
enumerator ebcdic_fi_se
-
enumerator ebcdic_fi_se_a
-
enumerator ebcdic_fr
-
enumerator ebcdic_it
-
enumerator ebcdic_pt
-
enumerator ebcdic_uk
-
enumerator ebcdic_us
-
enumerator ecma_cyrillic
-
enumerator es
-
enumerator es2
-
enumerator euc_jp
-
enumerator euc_kr
-
enumerator extended_unix_code_fixed_width_for_japanese
-
enumerator gb18030
-
enumerator gb2312
-
enumerator gb_1988_80
-
enumerator gb_2312_80
-
enumerator gbk
-
enumerator gost_19768_74
-
enumerator greek7
-
enumerator greek7_old
-
enumerator greek_ccitt
-
enumerator hp_desktop
-
enumerator hp_legal
-
enumerator hp_math8
-
enumerator hp_pi_font
-
enumerator hp_roman8
-
enumerator hz_gb_2312
-
enumerator ibm00858
-
enumerator ibm00924
-
enumerator ibm01140
-
enumerator ibm01141
-
enumerator ibm01142
-
enumerator ibm01143
-
enumerator ibm01144
-
enumerator ibm01145
-
enumerator ibm01146
-
enumerator ibm01147
-
enumerator ibm01148
-
enumerator ibm01149
-
enumerator ibm037
-
enumerator ibm038
-
enumerator ibm1026
-
enumerator ibm1047
-
enumerator ibm273
-
enumerator ibm274
-
enumerator ibm275
-
enumerator ibm277
-
enumerator ibm278
-
enumerator ibm280
-
enumerator ibm281
-
enumerator ibm284
-
enumerator ibm285
-
enumerator ibm290
-
enumerator ibm297
-
enumerator ibm420
-
enumerator ibm423
-
enumerator ibm424
-
enumerator ibm437
-
enumerator ibm500
-
enumerator ibm775
-
enumerator ibm850
-
enumerator ibm851
-
enumerator ibm852
-
enumerator ibm855
-
enumerator ibm857
-
enumerator ibm860
-
enumerator ibm861
-
enumerator ibm862
-
enumerator ibm863
-
enumerator ibm864
-
enumerator ibm865
-
enumerator ibm866
-
enumerator ibm868
-
enumerator ibm869
-
enumerator ibm870
-
enumerator ibm871
-
enumerator ibm880
-
enumerator ibm891
-
enumerator ibm903
-
enumerator ibm904
-
enumerator ibm905
-
enumerator ibm918
-
enumerator ibm_symbols
-
enumerator ibm_thai
-
enumerator iec_p27_1
-
enumerator inis
-
enumerator inis_8
-
enumerator inis_cyrillic
-
enumerator invariant
-
enumerator iso_10367_box
-
enumerator iso_10646_j_1
-
enumerator iso_10646_ucs_2
-
enumerator iso_10646_ucs_4
-
enumerator iso_10646_ucs_basic
-
enumerator iso_10646_unicode_latin1
-
enumerator iso_10646_utf_1
-
enumerator iso_11548_1
-
enumerator iso_2022_cn
-
enumerator iso_2022_cn_ext
-
enumerator iso_2022_jp
-
enumerator iso_2022_jp_2
-
enumerator iso_2022_kr
-
enumerator iso_2033_1983
-
enumerator iso_5427
-
enumerator iso_5427_1981
-
enumerator iso_5428_1980
-
enumerator iso_646_basic_1983
-
enumerator iso_646_irv_1983
-
enumerator iso_6937_2_25
-
enumerator iso_6937_2_add
-
enumerator iso_8859_1
-
enumerator iso_8859_10
-
enumerator iso_8859_13
-
enumerator iso_8859_14
-
enumerator iso_8859_15
-
enumerator iso_8859_16
-
enumerator iso_8859_1_windows_3_0_latin_1
-
enumerator iso_8859_1_windows_3_1_latin_1
-
enumerator iso_8859_2
-
enumerator iso_8859_2_windows_latin_2
-
enumerator iso_8859_3
-
enumerator iso_8859_4
-
enumerator iso_8859_5
-
enumerator iso_8859_6
-
enumerator iso_8859_6_e
-
enumerator iso_8859_6_i
-
enumerator iso_8859_7
-
enumerator iso_8859_8
-
enumerator iso_8859_8_e
-
enumerator iso_8859_8_i
-
enumerator iso_8859_9
-
enumerator iso_8859_9_windows_latin_5
-
enumerator iso_8859_supp
-
enumerator iso_ir_90
-
enumerator iso_unicode_ibm_1261
-
enumerator iso_unicode_ibm_1264
-
enumerator iso_unicode_ibm_1265
-
enumerator iso_unicode_ibm_1268
-
enumerator iso_unicode_ibm_1276
-
enumerator it
-
enumerator jis_c6220_1969_jp
-
enumerator jis_c6220_1969_ro
-
enumerator jis_c6226_1978
-
enumerator jis_c6226_1983
-
enumerator jis_c6229_1984_a
-
enumerator jis_c6229_1984_b
-
enumerator jis_c6229_1984_b_add
-
enumerator jis_c6229_1984_hand
-
enumerator jis_c6229_1984_hand_add
-
enumerator jis_c6229_1984_kana
-
enumerator jis_encoding
-
enumerator jis_x0201
-
enumerator jis_x0212_1990
-
enumerator jus_i_b1_002
-
enumerator jus_i_b1_003_mac
-
enumerator jus_i_b1_003_serb
-
enumerator koi7_switched
-
enumerator koi8_r
-
enumerator koi8_u
-
enumerator ks_c_5601_1987
-
enumerator ksc5636
-
enumerator kz_1048
-
enumerator latin_greek
-
enumerator latin_greek_1
-
enumerator latin_lap
-
enumerator macintosh
-
enumerator microsoft_publishing
-
enumerator mnem
-
enumerator mnemonic
-
enumerator msz_7795_3
-
enumerator nats_dano
-
enumerator nats_dano_add
-
enumerator nats_sefi
-
enumerator nats_sefi_add
-
enumerator nc_nc00_10_81
-
enumerator nf_z_62_010
-
enumerator nf_z_62_010_1973
-
enumerator ns_4551_1
-
enumerator ns_4551_2
-
enumerator osd_ebcdic_df03_irv
-
enumerator osd_ebcdic_df04_1
-
enumerator osd_ebcdic_df04_15
-
enumerator pc8_danish_norwegian
-
enumerator pc8_turkish
-
enumerator pt
-
enumerator pt2
-
enumerator ptcp154
-
enumerator scsu
-
enumerator sen_850200_b
-
enumerator sen_850200_c
-
enumerator shift_jis
-
enumerator t_101_g2
-
enumerator t_61_7bit
-
enumerator t_61_8bit
-
enumerator tis_620
-
enumerator tscii
-
enumerator unicode_1_1
-
enumerator unicode_1_1_utf_7
-
enumerator unknown_8bit
-
enumerator us_ascii
-
enumerator us_dk
-
enumerator utf_16
-
enumerator utf_16be
-
enumerator utf_16le
-
enumerator utf_32
-
enumerator utf_32be
-
enumerator utf_32le
-
enumerator utf_7
-
enumerator utf_7_imap
-
enumerator utf_8
-
enumerator ventura_international
-
enumerator ventura_math
-
enumerator ventura_us
-
enumerator videotex_suppl
-
enumerator viqr
-
enumerator viscii
-
enumerator windows_1250
-
enumerator windows_1251
-
enumerator windows_1252
-
enumerator windows_1253
-
enumerator windows_1254
-
enumerator windows_1255
-
enumerator windows_1256
-
enumerator windows_1257
-
enumerator windows_1258
-
enumerator windows_31j
-
enumerator windows_874
-
enum class orcus::dump_format_t
Formats supported by orcus as output formats.
Values:
-
enumerator unknown
-
enumerator none
-
enumerator check
-
enumerator csv
-
enumerator flat
-
enumerator html
-
enumerator json
-
enumerator xml
-
enumerator yaml
-
enumerator debug_state
-
enum class orcus::format_t
Input formats that orcus can import.
Values:
-
enumerator unknown
-
enumerator ods
-
enumerator xlsx
-
enumerator gnumeric
-
enumerator xls_xml
-
enumerator csv
-
enumerator parquet
-
enum class orcus::length_unit_t
Unit of length, as used in length_t.
Values:
-
enumerator unknown
-
enumerator centimeter
-
enumerator millimeter
-
enumerator xlsx_column_digit
Special unit of length used by Excel, defined as the maximum digit width of the font used as the “Normal” style font.
Note
Since it’s not possible to determine the actual length using this unit, it is approximated by 1.9 millimeters.
-
enumerator inch
-
enumerator point
-
enumerator twip
One twip is a twentieth of a point, which equals 1/1440 of an inch.
-
enumerator pixel
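The relationships between some of these units can be written out as plain arithmetic. The sketch below is standalone and does not use length_t; all function and constant names are hypothetical. It relies on 1 inch = 72 points = 1440 twips = 25.4 millimeters, plus the 1.9 millimeter approximation noted above for xlsx_column_digit.

```cpp
// Conversion factors: 1 inch = 72 points = 1440 twips = 25.4 mm.
constexpr double twips_per_point = 20.0;
constexpr double points_per_inch = 72.0;
constexpr double twips_per_inch = twips_per_point * points_per_inch;
constexpr double mm_per_inch = 25.4;

// Approximation quoted in the note on xlsx_column_digit.
constexpr double mm_per_xlsx_column_digit = 1.9;

constexpr double twips_to_points(double twips)
{
    return twips / twips_per_point;
}

constexpr double twips_to_inches(double twips)
{
    return twips / twips_per_inch;
}

constexpr double twips_to_millimeters(double twips)
{
    return twips_to_inches(twips) * mm_per_inch;
}

constexpr double column_digits_to_millimeters(double digits)
{
    return digits * mm_per_xlsx_column_digit;
}

// A twentieth of a point and 1/1440 of an inch describe the same unit.
static_assert(twips_per_inch == 1440.0, "20 twips per point times 72 points per inch");
```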
Utility functions
-
std::vector<std::pair<std::string_view, dump_format_t>> orcus::get_dump_format_entries()
Get a list of available output format entries. Each entry consists of the name of a format and its enum value equivalent.
- Returns:
list of available output format entries.
-
character_set_t orcus::to_character_set(std::string_view s)
Parse a string that represents a character set and convert it to a corresponding enum value.
- Parameters:
s – string representing a character set.
- Returns:
enum value representing a character set, or character_set_t::unspecified in case it cannot be determined.
-
dump_format_t orcus::to_dump_format_enum(std::string_view s)
Parse a string that represents an output format type and convert it to a corresponding enum value.
- Parameters:
s – string representing an output format type.
- Returns:
enum value representing an output format type, or dump_format_t::unknown in case it cannot be determined.
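The name-to-enum conversion with a fallback value can be sketched as a lookup over a list of name/value entries, the same shape that get_dump_format_entries() returns. This is a standalone illustration, not the orcus implementation: `out_format`, `get_entries` and `to_out_format` are hypothetical names, and the enum mirrors only a few of the dump_format_t values.

```cpp
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical local mirror of a few dump_format_t values.
enum class out_format { unknown, csv, json, yaml };

// Same shape as the entry list described above: (name, enum value) pairs.
std::vector<std::pair<std::string_view, out_format>> get_entries()
{
    return {
        {"csv", out_format::csv},
        {"json", out_format::json},
        {"yaml", out_format::yaml},
    };
}

// Linear lookup by name; falls back to 'unknown' when nothing matches.
out_format to_out_format(std::string_view s)
{
    for (const auto& [name, value] : get_entries())
    {
        if (name == s)
            return value;
    }

    return out_format::unknown;
}
```

Callers are expected to compare the result against the fallback value before using it, since an unrecognized name does not raise an error in this sketch.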