Artificial Intelligence (AI)
- Intro to Machine Learning – from Kaggle Intro to Machine Learning with explanations and practice
- Julius AI Julius is a powerful AI data analyst that helps you analyze and visualize your data. Chat with your data, create graphs, build forecasting models, and more.
- Machine Learning (Overview) – from Wikiwand Study of algorithms that improve automatically through experience
- Machine Learning Tutorial – from GeeksforGeeks Machine Learning tutorial covers basic and advanced concepts, specially designed to cater to both students and experienced working professionals.
- N8N N8N helps you to connect any app with an API with any other, and manipulate its data with little or no code.
- Welcome AI A platform to research and automatically generate a successful AI strategy. Aggregated industry research, use cases, case studies and educational programs. Discover the potential of AI and make it a reality for your business.
Data Analysis
- Beautiful Soup – Python (Module Documentation) Web Scraping with Python – Beautiful Soup Module
- Crawlbase A company with a philosophy of caring about data and loving the freedom of the internet. Excellent tutorials and how-to information.
- DataPrep (Python library) The easiest way to prepare data in Python
- EDA (Exploratory Data Analysis) In statistics, exploratory data analysis is an approach of analyzing data sets to summarize their main characteristics, often using statistical graphics and other data visualization methods.
- ExcelJet – Quick, clean, and to the point Excel formulas and more
- Julius AI Julius is a powerful AI data analyst that helps you analyze and visualize your data. Chat with your data, create graphs, build forecasting models, and more.
- Presto SQL and Data Query Engine
Data Engineering Software and Tools
- Julius AI Julius is a powerful AI data analyst that helps you analyze and visualize your data. Chat with your data, create graphs, build forecasting models, and more.
- N8N N8N helps you to connect any app with an API with any other, and manipulate its data with little or no code.
- Presto SQL and Data Query Engine
- PyTest Module The pytest framework makes it easy to write small, readable tests, and can scale to support complex functional testing for applications and libraries.
- Redash Redash supports SQL, NoSQL, Big Data and API data sources – query your data from different sources to answer complex questions.
- Superset Apache Superset™ is an open-source modern data exploration and visualization platform.
- Upsolver Upsolver built cloud-native (read: infinitely scalable) technology for ingesting change data from operational databases and streaming data from message buses in production.
Data Mapping
- Data Mapping (using Python) Data Mapping – Techniques using Python, NumPy, and Pandas
- Python Mappings: A Comprehensive Guide Dictionaries are the most common and well-known of Python’s mappings. However, there are other mappings in Python’s standard library and third-party modules. Mappings share common characteristics, and understanding these shared traits will help you use them more effectively.
- What is Data Mapping – 101 Guide What is Data Mapping? – Intro guide
Data Processing
- Crawlbase A company with a philosophy of caring about data and loving the freedom of the internet. Excellent tutorials and how-to information.
- CronTab Guru: Cron Jobs tutorial The quick and simple editor for cron schedule expressions by Cronitor
- Dask Website Dask was developed to natively scale these packages and the surrounding ecosystem to multi-core machines and distributed clusters when datasets exceed memory.
- OLAP vs OLTP What is the difference between OLAP and OLTP data processing systems? (IBM)
- Spark Latest Documentation Documentation for the latest version of Apache Spark
Data Science
- Daily Dose of Data Science Daily Dose of Data Science brings together intriguing frameworks, libraries, technologies, and tips that make the life cycle of a Data Science project effortless.
- Dask Website Dask was developed to natively scale these packages and the surrounding ecosystem to multi-core machines and distributed clusters when datasets exceed memory.
- Data Science Ethics Course about ethics in the Data Science and Data Analytics fields
- Datagy Datagy is a site that makes learning different data science and Python skills intuitive and easy to understand.
- Kaggle Data Science community Kaggle's community is a diverse group of 20 million data scientists, ML engineers & enthusiasts from around the world who focus on learning data science & ML, staying up to date on the latest techniques, and collaborating.
Data Storage
- Apache Parquet Apache Parquet is an open source, column-oriented data file format designed for efficient data storage and retrieval. It provides efficient data compression and encoding schemes with enhanced performance to handle complex data in bulk. Parquet is available in multiple languages including Java, C++, Python, etc…
- Upsolver Upsolver built cloud-native (read: infinitely scalable) technology for ingesting change data from operational databases and streaming data from message buses in production.
Data Visualization
- Book: The Grammar of Graphics
- ggplot2 Data Visualization library for R
- ggplot2 – Online book ggplot2: elegant graphics for data analysis
- MatPlotLib Matplotlib: Visualization with Python
- Python Graph Gallery A collection of hundreds of charts made with Python.
- R Graph Gallery A collection of charts made with the R programming language
- Redash Redash supports SQL, NoSQL, Big Data and API data sources – query your data from different sources to answer complex questions.
- Superset Apache Superset™ is an open-source modern data exploration and visualization platform.
Databases
- MSSQL-Tips Useful tips and tricks for using MS SQL Server
Datasets
- Our World in Data Research and data to make progress against the world’s largest problems
Excel
- ExcelJet – Quick, clean, and to the point Excel formulas and more
Machine Learning
- Intro to Machine Learning – from Kaggle Intro to Machine Learning with explanations and practice
- Machine Learning (Overview) – from Wikiwand Study of algorithms that improve automatically through experience
- Machine Learning Tutorial – from GeeksforGeeks Machine Learning tutorial covers basic and advanced concepts, specially designed to cater to both students and experienced working professionals.
- MLFlow MLflow is an open-source platform, purpose-built to assist machine learning practitioners and teams in handling the complexities of the machine learning process. MLflow focuses on the full lifecycle for machine learning projects, ensuring that each phase is manageable, traceable, and reproducible.
Markdown
- How to use Kaggle Notebooks
- R Markdown Markdown format in R Studio
- R Markdown reference guide R Studio Markdown reference guide
- R Markdown: The Definitive Guide (Online Book) A guide of Markdown in R
- Shiny on R Studio tutorial Tutorial for Shiny package in R Studio
- The Jupyter Notebook Introduction to Jupyter Notebooks
- The Jupyter Notebook Formatting Guide
- Using Markdown in Jupyter Notebook
Organizations
- DAMA International DAMA International is a not-for-profit, vendor-independent, global association of technical and business professionals dedicated to advancing the concepts and practices of information and data management.
- International Society of Chief Data Officers We are the premier, vendor-neutral professional organization established for and by CDOs to promote data leadership.
Pandas
- Pandas API Reference (stable version) Pandas API Reference documentation
- Pandas Community Tutorials Community tutorials from the Pandas (Python) website
- TutorialKart – Pandas Tutorial and more
PySpark
- PySpark – Documentation Official PySpark Documentation
- PySpark – Tutorial
- PySpark Package
Python
- Beautiful Soup – Python (Module Documentation) Web Scraping with Python – Beautiful Soup Module
- Dask Website Dask was developed to natively scale these packages and the surrounding ecosystem to multi-core machines and distributed clusters when datasets exceed memory.
- Data Mapping (using Python) Data Mapping – Techniques using Python, NumPy, and Pandas
- Data Scraping with Python Useful tutorial for Web Scraping with Python
- Data Scraping with Python (Geeks for Geeks) Useful and simple tutorial for Web Scraping with Python
- Datagy Datagy is a site that makes learning different data science and Python skills intuitive and easy to understand.
- DataPrep (Python library) The easiest way to prepare data in Python
- Django Tutorial (Server-side Web Development) Django tutorial from Mozilla
- Google Colab (Intro) Introduction to Google Colab
- How to use Kaggle Notebooks
- MatPlotLib Matplotlib: Visualization with Python
- Pandas API Reference (stable version) Pandas API Reference documentation
- Pandas Community Tutorials Community tutorials from the Pandas (Python) website
- Python Graph Gallery A collection of hundreds of charts made with Python.
- PYNative Python Tutorials, Exercises and more.
- PySpark – Documentation Official PySpark Documentation
- PySpark Package
- PyTest Module The pytest framework makes it easy to write small, readable tests, and can scale to support complex functional testing for applications and libraries.
- Python Mappings: A Comprehensive Guide Dictionaries are the most common and well-known of Python’s mappings. However, there are other mappings in Python’s standard library and third-party modules. Mappings share common characteristics, and understanding these shared traits will help you use them more effectively.
- Python strftime cheatsheet Standard format for date/time datatype
- PyTutorial Python Tutorials
- The Jupyter Notebook Introduction to Jupyter Notebooks
- The Jupyter Notebook Formatting Guide
- TutorialKart – Pandas Tutorial and more
R
- Bias Function in R The bias function in R can help to identify and manage bias in data analysis
- ggplot2 Data Visualization library for R
- ggplot2 – Online book ggplot2: elegant graphics for data analysis
- Introduction to R A tutorial with basics about R programming
- R for Data Science – Free Web Book Free Website for the Book "R for Data Science".
- R Graph Gallery A collection of charts made with the R programming language
- R Ladies Sydney – Courses Various tips for R programming
- R Markdown Markdown format in R Studio
- R Markdown reference guide R Studio Markdown reference guide
- R Markdown: The Definitive Guide (Online Book) A guide of Markdown in R
- readxl The readxl package makes it easy to get data out of Excel and into R.
- Shiny on R Studio tutorial Tutorial for Shiny package in R Studio
- The R Datasets Package Example datasets for R
- The Tidyverse Cookbook Online book that collects code recipes for doing data science with R’s tidyverse.
- Tibble About R Tibbles
- Tidyverse Tidyverse package for R. Official site.
Tutorials
- Apache Parquet Apache Parquet is an open source, column-oriented data file format designed for efficient data storage and retrieval. It provides efficient data compression and encoding schemes with enhanced performance to handle complex data in bulk. Parquet is available in multiple languages including Java, C++, Python, etc…
- Beautiful Soup – Python (Module Documentation) Web Scraping with Python – Beautiful Soup Module
- Crawlbase A company with a philosophy of caring about data and loving the freedom of the internet. Excellent tutorials and how-to information.
- CronTab Guru: Cron Jobs tutorial The quick and simple editor for cron schedule expressions by Cronitor
- Daily Dose of Data Science Daily Dose of Data Science brings together intriguing frameworks, libraries, technologies, and tips that make the life cycle of a Data Science project effortless.
- Data Scraping with Python Useful tutorial for Web Scraping with Python
- Data Scraping with Python (Geeks for Geeks) Useful and simple tutorial for Web Scraping with Python
- Datagy Datagy is a site that makes learning different data science and Python skills intuitive and easy to understand.
- Django Tutorial (Server-side Web Development) Django tutorial from Mozilla
- ExcelJet – Quick, clean, and to the point Excel formulas and more
- IBM Topics Learn about popular topics in technology from IBM experts
- Intro to Machine Learning – from Kaggle Intro to Machine Learning with explanations and practice
- Kaggle Data Science community Kaggle's community is a diverse group of 20 million data scientists, ML engineers & enthusiasts from around the world who focus on learning data science & ML, staying up to date on the latest techniques, and collaborating.
- Machine Learning Tutorial – from GeeksforGeeks Machine Learning tutorial covers basic and advanced concepts, specially designed to cater to both students and experienced working professionals.
- MSSQL-Tips Useful tips and tricks for using MS SQL Server
- Pandas Community Tutorials Community tutorials from the Pandas (Python) website
- PYNative Python Tutorials, Exercises and more.
- PyTutorial Python Tutorials
- Tutorial Deep
- TutorialKart – Pandas Tutorial and more
- VBA for beginners VBA tutorial for beginners
- Visual Basic documentation (Microsoft Website) Tutorial for Visual Basic from the Microsoft Documentation website