Section 1: Intro to GCP and its services
Section 2: Intro to BigQuery
8. Conventional Data Warehouse Problems
9. What is BigQuery
BigQuery is a fully managed, serverless, highly scalable and cost-effective cloud Data Warehouse designed for business agility.
- Supports both batch and streaming data ingestion
- Streaming: can ingest 100,000 rows per second
- Batch: can load terabytes (TB) of data per second
- Supports AI and ML
- BigQuery ML
- Integration with the AI Platform
- Prediction and TensorFlow
- Fully managed
- Scalability
- Pay as you go
- Pay separately for storage and compute
- Pay for bytes that your query processes
- Query results are cached, so rerunning the same query is not charged a second time
- Automated data transfer
- Fully managed data transfer
- Transfer from Teradata and S3 to BigQuery
- Access control
- Use IAM
- Assign permissions (read/write access, running jobs, etc.) per project
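The "pay for bytes that your query processes" point above can be turned into a quick back-of-the-envelope estimate. This is only a sketch: the function name `estimate_query_cost` is made up for these notes, and the default `price_per_tib` is a placeholder assumption, not a quoted price, so check the current BigQuery pricing page before relying on the numbers.

```python
TIB = 2 ** 40  # number of bytes in one tebibyte


def estimate_query_cost(bytes_processed, price_per_tib=6.25, cached=False):
    """Rough on-demand cost (USD) for one query.

    price_per_tib is an ASSUMED placeholder rate, not an official figure.
    cached=True models a result served from the query cache (no charge).
    """
    if bytes_processed < 0:
        raise ValueError("bytes_processed must be non-negative")
    if cached:
        return 0.0
    return bytes_processed / TIB * price_per_tib


# First run scans half a TiB; the identical rerun is served from cache.
first_run = estimate_query_cost(TIB // 2)
rerun = estimate_query_cost(TIB // 2, cached=True)
print(first_run, rerun)
```

The estimate ignores storage charges, which are billed separately from compute.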
10. Out-of-the-Box (OOB) Features
https://www.udemy.com/course/bigquery/learn/lecture/22717593#overview
- BQ GIS
- Geographic Information System
- Obtain insights from geographic data points using longitude/latitude
- Automatic backup
- Keeps a 7-day history of changes
- Integration with other GCP services
- Dataproc
- Foundation for BI
- Seamless integration, transformation, analysis, visualization
- Programmatic Interaction
- REST API
- Client libraries in Java, Python, Node.js, C#, Go, Ruby and PHP
- Security
- Data is encrypted at rest and in transit
- Each data block encrypted with different keys
- Logging, monitoring and alerting
- Cloud Audit Logs
- Federated queries
- Process data in Object Storage
- Open-source formats such as Parquet and ORC
- Process transactional databases
- Bigtable, Cloud SQL, spreadsheets in Drive
- You can pull data directly from a CSV file…
- Data Science Workloads
- Spark, TensorFlow, scikit-learn
- No need to have multiple copies of the same data
- Powerful data repository
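To make the "Programmatic Interaction" item above concrete, here is a minimal sketch of what a call to the REST API's `jobs.query` method looks like when built by hand with the Python standard library. The project ID, SQL text and token are hypothetical placeholders, and the helper name `build_query_request` is invented for these notes. In practice the official client libraries wrap all of this.

```python
import json
import urllib.request

BASE_URL = "https://bigquery.googleapis.com/bigquery/v2"


def build_query_request(project_id, sql, access_token, dry_run=False):
    """Build (but do not send) a POST request for the jobs.query method."""
    url = f"{BASE_URL}/projects/{project_id}/queries"
    body = {
        "query": sql,
        "useLegacySql": False,  # use standard SQL
        "dryRun": dry_run,      # True = validate and estimate only
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values; a real call needs a valid OAuth 2.0 access token.
req = build_query_request("my-project", "SELECT 1 AS x", "TOKEN")
print(req.get_method(), req.full_url)
# Sending would be: urllib.request.urlopen(req)  (not done here)
```

Access is still governed by IAM: the token's identity needs permission to run jobs in the project.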
11. Architecture of BigQuery
https://www.udemy.com/course/bigquery/learn/lecture/22717627#overview
- Engine – Dremel
- Combination of columnar data layouts and tree architecture
- File system – Colossus
- Google's distributed file system; data is stored in a columnar format
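Why a columnar layout matters for analytics can be shown with a toy example. This is an illustration only, not how Dremel or Colossus actually store data: the sample table and the function names are invented for these notes. It counts how many values must be touched to sum one field in a row-oriented layout versus a column-oriented one.

```python
# Toy table in row-oriented form: one record per row.
rows = [
    {"user": "a", "country": "US", "amount": 10},
    {"user": "b", "country": "DE", "amount": 25},
    {"user": "c", "country": "US", "amount": 7},
]


def to_columnar(rows):
    """Pivot a list of records into one list of values per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def sum_row_oriented(rows, field):
    """Sum one field; every value of every record is read."""
    total, scanned = 0, 0
    for row in rows:
        scanned += len(row)  # the whole record is touched
        total += row[field]
    return total, scanned


def sum_columnar(columns, field):
    """Sum one field; only that column's values are read."""
    values = columns[field]
    return sum(values), len(values)


columns = to_columnar(rows)
print(sum_row_oriented(rows, "amount"))  # (42, 9)
print(sum_columnar(columns, "amount"))   # (42, 3)
```

Both approaches return the same total, but the columnar version touches a third of the values here. This ties back to the pricing model: a query that selects fewer columns processes fewer bytes.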
