{"id":3719,"date":"2022-02-02T12:58:53","date_gmt":"2022-02-02T12:58:53","guid":{"rendered":"https:\/\/wiki.thomasandsofia.com\/?p=3719"},"modified":"2022-02-08T02:55:59","modified_gmt":"2022-02-08T02:55:59","slug":"bqfbd-dataset-and-table-creation","status":"publish","type":"post","link":"https:\/\/wiki.thomasandsofia.com\/?p=3719","title":{"rendered":"BQFBD &#8211; Dataset and Table Creation"},"content":{"rendered":"<p><a href=\"\/bigquery-for-big-data\/\">Main Menu<\/a><\/p>\n<h1>Section 3: Dataset and Table Creation<\/h1>\n<h2>12. Setup a GCP account<\/h2>\n<p>https:\/\/www.udemy.com\/course\/bigquery\/learn\/lecture\/22719631#overview<\/p>\n<p>Steps to signup for a free Google cloud setup and get $300 for 12 months in your account<\/p>\n<ol>\n<li>Go to <a href=\"http:\/\/cloud.google.com\" target=\"_blank\" rel=\"noopener noreferrer\">cloud.google.com<\/a><\/li>\n<li>Click on &#8220;Get started for free&#8221;<\/li>\n<li>Enter your Gmail id and password.<\/li>\n<li>Accept the Terms of service and click Continue.<\/li>\n<li>Select account type as &#8216;Individual&#8217;, Tax registration as &#8216;Unregistered Individual&#8217; and other mandatory address details.<\/li>\n<li>Enter your credit\/debit card details. A very small amount (for ex \u20b91 or US $0.013) will be deducted from your bank card just to ensure its authenticity which will anyhow be refunded back to your card.<\/li>\n<\/ol>\n<h2>13: Note<\/h2>\n<p>https:\/\/www.udemy.com\/course\/bigquery\/learn\/lecture\/28810020#overview<\/p>\n<p>Once you open BigQuery, you may see a different UI for the initial screen. To get the relevant UI<\/p>\n<ol>\n<li>Click on &#8220;DISABLE EDITOR TABS&#8221;<\/li>\n<li>Click on &#8220;Opt Out&#8221;<\/li>\n<\/ol>\n<p>&nbsp;<\/p>\n<h2>14. Create a project<\/h2>\n<p>&nbsp;<\/p>\n<h2>15. BigQuery UI Tour<\/h2>\n<p>https:\/\/www.udemy.com\/course\/bigquery\/learn\/lecture\/22718867#overview<\/p>\n<ul>\n<li>Queries are a type of JOB\n<ul>\n<li>Can be saved<\/li>\n<\/ul>\n<\/li>\n<li>Jobs\n<ul>\n<li>Load, export, copy and query jobs<\/li>\n<li>saved for 6 months<\/li>\n<\/ul>\n<\/li>\n<li>Transfer\n<ul>\n<li>Import data<\/li>\n<\/ul>\n<\/li>\n<li>Scheduled Queries<\/li>\n<li>Reservations\n<ul>\n<li>no info<\/li>\n<\/ul>\n<\/li>\n<li>BI Engine\n<ul>\n<li>no info<\/li>\n<\/ul>\n<\/li>\n<li>Resources\n<ul>\n<li>Pinned projects<\/li>\n<li>Add Data\n<ul>\n<li>Public Data Sets for hands on&#8230;\n<ul>\n<li>https:\/\/console.cloud.google.com\/bigquery?project=bigquery-public-data&amp;page=project<\/li>\n<li>Pin the project<\/li>\n<li>Select your project from the top menu<\/li>\n<li>All set!<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<h2>16. Region vs Multi-region<\/h2>\n<p>https:\/\/www.udemy.com\/course\/bigquery\/learn\/lecture\/22718881#overview<\/p>\n<ul>\n<li>Data is not directly stored into tables.<\/li>\n<li>Dataset (Top level container)\n<ul>\n<li>Meta Data\n<ul>\n<li>Zone<\/li>\n<li>Encryption keys<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<li>After the dataset has been create, you can add tables.\n<ul>\n<li>Table1<\/li>\n<li>Table2&#8230;<\/li>\n<\/ul>\n<\/li>\n<li>Dataset Names\n<ul>\n<li>1024 characters (aA,09,_)\n<ul>\n<li>No spaces, hyphens, special chars<\/li>\n<\/ul>\n<\/li>\n<li>Case sensitive\n<ul>\n<li>DATASET1 and dataset1 can coexist<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p>How to Create a Dataset<\/p>\n<ul>\n<li>Setup Billing\n<ul>\n<li>[ Upgrade ]\n<ul>\n<li>Won&#8217;t be billed, but not setting this up will restrict access to some features.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<li>Select the Project &gt; Create Dataset\n<ul>\n<li>Provide unique name<\/li>\n<li>Select Location (US by default)\n<ul>\n<li>Regions can ONLY be selected at time of creation!<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<h3>Regions vs Multi-Region<\/h3>\n<ul>\n<li>Region is a specific Geographic place.\n<ul>\n<li>Examples\n<ul>\n<li>Las Vegas<\/li>\n<li>Tokyo<\/li>\n<\/ul>\n<\/li>\n<li>Select a region close to you<\/li>\n<\/ul>\n<\/li>\n<li>Multi-Regons are large geographic locations with multiple regions\n<ul>\n<li>Examples\n<ul>\n<li>United States<\/li>\n<li>European Union<\/li>\n<\/ul>\n<\/li>\n<li>Use multi-region when you know your data is going to be access from overall US or overall EU.<\/li>\n<li>This will make copies of your data to multiple regions.\n<ul>\n<li>Stored in one region but backed up to multiple regions<\/li>\n<li>Adds Resiliency<\/li>\n<li>Adds Cost<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<li>Select a region carefully\n<ul>\n<li>Can affect cost<\/li>\n<li>Can affect performance<\/li>\n<\/ul>\n<\/li>\n<li>If using data NOT in BigQuery (storage bucket?)\n<ul>\n<li>Dataset MUST be in the same region as the external data.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<h2>17. Create a Dataset<\/h2>\n<p>https:\/\/www.udemy.com\/course\/bigquery\/learn\/lecture\/22718883#questions<\/p>\n<ul>\n<li>Can create any number of Datasets in a project\n<ul>\n<li>&gt; 1000 Dashboard performance can degrade<\/li>\n<\/ul>\n<\/li>\n<li>Unlimited tables\n<ul>\n<li>~ 50,000 enumerating slows<\/li>\n<li>This is the max the UI will display<\/li>\n<\/ul>\n<\/li>\n<li>Max 2500 authorized views to a dataset&#8217;s control list<\/li>\n<\/ul>\n<h2>18. Create a Table<\/h2>\n<p>https:\/\/www.udemy.com\/course\/bigquery\/learn\/lecture\/22718891#questions<\/p>\n<ul>\n<li>Download the names.zip file and extract.\n<ul>\n<li>Name, Gender, SS-last 4<\/li>\n<\/ul>\n<\/li>\n<li>Select Dataset &gt; Create Table\n<ul>\n<li>Create table from: Upload\n<ul>\n<li>You do not need to first create a table for uploaded data.<\/li>\n<\/ul>\n<\/li>\n<li>Select file\n<ul>\n<li>&lt; 10MB<\/li>\n<li>&lt; 16,000 rows<\/li>\n<li>Multiple files should be uploaded individually<\/li>\n<\/ul>\n<\/li>\n<li>File Format: CSV<\/li>\n<li>Destination\n<ul>\n<li>Search for a project &gt; Select project name<\/li>\n<li>Table type\n<ul>\n<li>Native: Stored in BigQuery<\/li>\n<li>External: Backed by storage external to BigQuery, like Cloud storage<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<li>Table name: &#8220;names&#8221;<\/li>\n<li>Schema\n<ul>\n<li>Manual (Do not select Auto)<\/li>\n<li>+ Add Field (column)\n<ul>\n<li>Note:\n<ul>\n<li>Columns must contain only letters, numbers or underscore<\/li>\n<li>Max Len: 128 chars<\/li>\n<li>Cannot use: _TABLE_, _FILE_, or _PARTITION<\/li>\n<li>Not case sensitive, &#8220;COL1&#8221; and &#8220;col1&#8221; CANNOT coexist.<\/li>\n<\/ul>\n<\/li>\n<li>Name: &#8220;name&#8221;\n<ul>\n<li>Type: String<\/li>\n<li>Mode: Nullable\n<ul>\n<li>Nullable: allows field to be null<\/li>\n<li>Required: Null not allowed<\/li>\n<li>Repeated: Contains array of values of the specified type.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<li>+ Add Field: &#8220;gender&#8221; (string, nullable)<\/li>\n<li>+ Add Field: &#8220;count&#8221; (integer, nullable)<\/li>\n<\/ul>\n<\/li>\n<li>Advanced\n<ul>\n<li>Write preference\n<ul>\n<li>Write if empty<\/li>\n<li>Append to table<\/li>\n<li>Overwrite table<\/li>\n<\/ul>\n<\/li>\n<li>Number of errors: 0<\/li>\n<li>Unknown values: [ _ ] Ignore unknown values\n<ul>\n<li>When enabled, this would ignore rows with extra values in them<\/li>\n<\/ul>\n<\/li>\n<li>Field delimiter: Comma<\/li>\n<li>Header rows to skip: 0<\/li>\n<li>Quoted newlines:<\/li>\n<li>Jagged rows: Check\n<ul>\n<li>if last value is null and not specified as such ( &#8230;,&#8221;&#8221;)<\/li>\n<li>and this is not checked, the row will be treated as bad data.<\/li>\n<\/ul>\n<\/li>\n<li>Encryption: Google managed key<\/li>\n<\/ul>\n<\/li>\n<li>[Create table]<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<h3>After the table is created you can:<\/h3>\n<ul>\n<li>View\/Edit the schema<\/li>\n<li>View\/Edit Details (expiration date)<\/li>\n<li><strong>Preview<\/strong> &#8211; Use this!!!\n<ul>\n<li>Similar to <code>select *<\/code> but free!\n<ul>\n<li>You will be charged for `select *`<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<h3>Query Charges<\/h3>\n<ul>\n<li>How much data is stored in the tables and its partitions<\/li>\n<li>Data extracted from the query (not clear)<\/li>\n<li>Table creation is free<\/li>\n<\/ul>\n<h3>Limitations<\/h3>\n<ul>\n<li>1500 operations per table per day\n<ul>\n<li>Includes the combined total of\n<ul>\n<li>Appends and Truncates<\/li>\n<li>Load jobs, Copy and Query Jobs that append to or overwrite a destination table<\/li>\n<li>Example\n<ul>\n<li>600 copy jobs + 900 query jobs<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<li>Max 10,000 columns in a Table, Query result or View<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Main Menu Section 3: Dataset and Table Creation 12. Setup a GCP account https:\/\/www.udemy.com\/course\/bigquery\/learn\/lecture\/22719631#overview Steps to signup for a free Google cloud setup and get $300 for 12 months in your account Go to cloud.google.com Click on &#8220;Get started for free&#8221; Enter your Gmail id and password. Accept the Terms of service and click Continue. ..<\/p>\n<div class=\"clear-fix\"><\/div>\n<p><a href=\"https:\/\/wiki.thomasandsofia.com\/?p=3719\" title=\"read more...\">Read more<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[77,78],"tags":[],"class_list":["post-3719","post","type-post","status-publish","format-standard","hentry","category-bigquery","category-bigquery-for-big-data-engineers"],"_links":{"self":[{"href":"https:\/\/wiki.thomasandsofia.com\/index.php?rest_route=\/wp\/v2\/posts\/3719","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/wiki.thomasandsofia.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/wiki.thomasandsofia.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/wiki.thomasandsofia.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/wiki.thomasandsofia.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=3719"}],"version-history":[{"count":12,"href":"https:\/\/wiki.thomasandsofia.com\/index.php?rest_route=\/wp\/v2\/posts\/3719\/revisions"}],"predecessor-version":[{"id":3733,"href":"https:\/\/wiki.thomasandsofia.com\/index.php?rest_route=\/wp\/v2\/posts\/3719\/revisions\/3733"}],"wp:attachment":[{"href":"https:\/\/wiki.thomasandsofia.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=3719"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/wiki.thomasandsofia.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=3719"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/wiki.thomasandsofia.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=3719"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}