If you define a column as array and the JSON document has a list of numbers, that works too, as long as type coercion works there is usually no problem. What’s good to know is that Athena is fairly forgiving when it comes to describing complex types. The official Athena documentation makes a fairly good job at describing how to create tables for JSON data. Most of the properties are usually scalar, with one or two lists of either strings or fairly simple objects with one or two properties each. In my experience, most JSON data isn’t very hierarchical. Being able to describe most JSON data in table form is one of the most powerful features of Athena. If you are using Athena to query JSON data you have most likely already worked with complex types in your data in the form of an array property or an object property. Complex types can have arbitrarily complex structure, for example array,tags:map> could be a way to describe a list of article metadata. Complex types are types that refer to other types – or more concretely in the case of Athena, array (lists of elements), map (key/value associations), and struct (key/value associations with a fixed schema).Ĭomplex types can be for example array (an array of strings), map (a map with string keys and boolean values), or struct (a record with a string property called “name” and an integer property called “age”). Simple types, or “scalars”, are things like number, boolean, string, and timestamp. ![]() In this article I’m going to cover how to use complex types in three different contexts: how to create tables when you have complex types in data, how to work with complex types in queries, and how to deal with complex types in results.įirst, let’s define complex types so that we’re on the same page. With Athena this is almost never an issue. The data I worked with almost always contained lists of strings and other similarly “simple” complex types. When you try to describe the world you often end up with lists of things, the one-or-more type of relationship is very common, and things like lists of tags, or key/value pairs of metadata is more or less standard.īefore Athena I worked a lot with Redshift, and was often frustrated with the lack of complex types. ![]() Developers like data formats like JSON because they allow for expressing things like one-to-many relationships naturally and in a self-contained manner. The relational model wasn’t made with complex types in mind, but modern data is very rarely flat. It even supports lists and maps in CSV files, if you really want it to. ![]() Most of the data formats that Athena supports have support for complex types in the form of lists and maps.
0 Comments
Leave a Reply. |
AuthorWrite something about yourself. No need to be fancy, just an overview. ArchivesCategories |