Practice the pronunciation of data serialisation format names including Avro, Parquet, Protobuf, Arrow, and CBOR.
1 / 5
How is 'Avro' pronounced?
Avro is pronounced /ˈeɪvroʊ/: 'AY-vroh'. The name comes from Avro, the British aircraft manufacturer A.V. Roe and Company. 'Av' = /eɪv/ (diphthong /eɪ/ as in 'rave', 'cave'). 'ro' = /roʊ/ (diphthong /oʊ/). Two syllables: AY-vroh, stress on the first. Non-native speakers often use a short /æ/ ('AV-roh'), treating the first vowel as in 'avocado'. Apache Avro is a row-based data serialisation framework in which data is always written and read against a schema (Avro data files store the schema alongside the records), widely used in Kafka messaging and Hadoop pipelines: 'Events are serialised as AY-vroh records in Kafka'.
2 / 5
How is 'Parquet' pronounced?
Parquet is pronounced /pɑːrˈkeɪ/ — 'par-KAY'. The name is the French word for wooden flooring (parquetry), borrowed with French pronunciation — silent final 't', stress on the second syllable. 'Par' = /pɑːr/ (rhotic vowel). '-quet' = /keɪ/ (as in 'bouquet', 'croquet'). Two syllables: par-KAY, stress on the second. Non-native speakers often say 'PAR-kit' (anglicising the ending) or 'PAR-ket'. Apache Parquet is a columnar storage format for the Hadoop ecosystem with efficient compression and encoding, ideal for analytical queries over large datasets: 'Store the analytics data in par-KAY format for faster queries'.
3 / 5
How is 'Protobuf' pronounced?
Protobuf is pronounced /ˈproʊtəbʌf/: 'PROH-tuh-buf'. Short for 'Protocol Buffers'. 'Proto' = /ˈproʊtə/, 'PROH-tuh' (diphthong /oʊ/ + schwa). 'buf' = /bʌf/ (short /ʌ/ as in 'buff'). Three syllables: PROH-tuh-buf, stress on the first. Non-native speakers may say 'PROT-uh-buf' (short /ɒ/) or 'PROH-tuh-boof' (long 'oo'). Protocol Buffers (Protobuf) is Google's binary serialisation format: messages are defined in a .proto schema file, from which the protoc compiler generates code for multiple languages. It is used with gRPC and for efficient data storage: 'Serialise the message with PROH-tuh-buf before sending over gRPC'.
4 / 5
How is 'Arrow' pronounced in a data engineering context?
Arrow is pronounced /ˈæroʊ/ — 'AR-oh'. The name is the familiar English word for a pointed projectile, evoking directional, fast data movement. 'Ar' = /ær/ (short /æ/ as in 'have'). '-row' = /roʊ/ (diphthong /oʊ/). Two syllables: AR-oh, stress on the first. Non-native speakers may use a long /eɪ/ ('AY-roh') treating the 'a' as the letter name. Apache Arrow is an in-memory columnar data format enabling zero-copy reads and interoperability between pandas, Spark, and Polars without serialisation overhead: 'Transfer data between Python and Rust using AR-oh format'.
5 / 5
How is 'CBOR' pronounced?
CBOR (Concise Binary Object Representation) is pronounced /ˈsiːbɔːr/: 'SEE-bor'. The acronym is spoken as a word. 'C' = /siː/ (soft 'c', the letter sound). 'BOR' = /bɔːr/ (the /ɔː/ vowel as in 'bore', 'more'). Two syllables: SEE-bor, stress on the first. Non-native speakers may spell it out ('SEE BEE OH AR') or use a hard /k/. CBOR is a binary data serialisation format based on JSON's data model but typically more compact, originally standardised in RFC 7049 (since replaced by RFC 8949), and used in IoT protocols (alongside CoAP) and Web Authentication (WebAuthn): 'The IoT device sends sensor data encoded as SEE-bor'.
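To make the "JSON data model, but more compact" point concrete, here is a minimal sketch of a CBOR encoder for a small subset of types (integers, text strings, lists, maps, booleans and null). The names `encode`, `_head`, and `reading` are hypothetical and chosen only for illustration; production code would normally use a maintained CBOR library rather than a hand-rolled encoder like this one.

```python
import json


def _head(major: int, value: int) -> bytes:
    """Encode a CBOR head: a 3-bit major type plus an unsigned argument."""
    if value < 24:
        # Small arguments fit directly in the low 5 bits of the first byte.
        return bytes([(major << 5) | value])
    # Larger arguments follow the first byte in 1, 2, 4 or 8 big-endian bytes.
    for info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if value < 1 << (8 * size):
            return bytes([(major << 5) | info]) + value.to_bytes(size, "big")
    raise ValueError("integer too large for this sketch")


def encode(obj) -> bytes:
    """Encode a subset of JSON-like Python values as CBOR (illustrative only)."""
    if obj is False:
        return b"\xf4"
    if obj is True:
        return b"\xf5"
    if obj is None:
        return b"\xf6"
    if isinstance(obj, int):
        # Major type 0 for non-negative integers, 1 for negative (stored as -1 - n).
        return _head(0, obj) if obj >= 0 else _head(1, -1 - obj)
    if isinstance(obj, str):
        data = obj.encode("utf-8")
        return _head(3, len(data)) + data
    if isinstance(obj, list):
        return _head(4, len(obj)) + b"".join(encode(item) for item in obj)
    if isinstance(obj, dict):
        body = b"".join(encode(k) + encode(v) for k, v in obj.items())
        return _head(5, len(obj)) + body
    raise TypeError(f"unsupported type in this sketch: {type(obj).__name__}")


# A hypothetical sensor reading, encoded both ways for comparison.
reading = {"id": 7, "temp": 21}
as_json = json.dumps(reading, separators=(",", ":")).encode("utf-8")
as_cbor = encode(reading)

print(len(as_json), as_json)        # 18 bytes of compact JSON
print(len(as_cbor), as_cbor.hex())  # 11 bytes of CBOR
```

For this small reading the CBOR form is 11 bytes against 18 bytes of compact JSON, mainly because keys and small integers need no quotes, colons, or commas. The saving varies with the data, so treat the figures as an illustration rather than a benchmark.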