Convert Jupyter notebooks to PDF

JupyterLab is the next-generation web-based user interface for Jupyter notebooks. It provides a tab-based programming interface that is highly extensible and interactive, and it supports more than 40 programming languages. We have already discussed how to use Jupyter notebooks for interactive data analysis with SQL Server. With the help of Jupyter notebooks, we can keep […]

Interactive Data Analysis with SQL Server using Jupyter Notebooks

In this post, “Interactive Data Analysis with SQL Server using Jupyter Notebooks”, we demonstrate how to use Jupyter notebooks for interactive data analysis with SQL Server. Jupyter notebooks are among the most useful tools for data scientists and data analysts. They support more than 40 programming languages and provide a web-based interactive programming environment. We can […]

Data compression in Hive – An Introduction to Hadoop Data Compression

Data compression is a technique that encodes the original data so that it can be represented with fewer bits. It is used to reduce the size of data files on disk. We know that the Hadoop framework is meant for large scale data […]
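
As a rough illustration of the idea in the excerpt above (fewer bits for the same data), the sketch below compresses a repetitive byte string with Python's standard `zlib` module and compares sizes. The sample data and variable names are hypothetical, and `zlib` is used here only as a convenient stand-in; it is not meant to represent how compression is configured in Hive or Hadoop.

```python
import zlib

# Hypothetical sample data: repetitive, log-like rows (an assumption for
# illustration; real files will compress more or less well than this).
original = b"2021-01-01,INFO,job completed successfully\n" * 1000

# Compress at zlib level 6, then decompress to confirm nothing was lost.
compressed = zlib.compress(original, 6)
restored = zlib.decompress(compressed)

# Compressed size as a fraction of the original size.
ratio = len(compressed) / len(original)

print(f"original:   {len(original)} bytes")
print(f"compressed: {len(compressed)} bytes")
print(f"ratio:      {ratio:.3f}")
```

Highly repetitive input like this shrinks considerably, while data that is already compressed or close to random may barely shrink at all.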

Python use case – Export SQL table data to Excel and CSV files – SQL Server 2017

In this post, we discuss how to export SQL Server table data to an Excel or CSV file using Python’s pandas library. Prior to SQL Server 2017, we could use one of the methods below to export data from SQL Server to an Excel or CSV file: Create an […]

SQL Server – Error 1061: The service cannot accept control messages at this time

Sometimes when we try to restart the SQL Server service, we might get the error “Windows could not stop the SQL Server (MSSQLSERVER) service on Local Computer” with the error code and description “Error 1061: The service cannot accept control messages at this time”. In this post, “SQL Server – Error 1061: The service cannot accept control […]

Read and write data to SQL Server from Spark using pyspark

Apache Spark is a very powerful general-purpose distributed computing framework. It provides several data abstractions, such as RDDs, DataFrames, and DataSets, on top of a distributed collection of data. Spark is a highly scalable big data processing engine that can run on anything from a single node to a cluster of thousands of nodes. To follow this exercise, […]

Install Spark on Windows (Local machine) with PySpark – Step by Step

Apache Spark is a general-purpose big data processing engine. It is a very powerful cluster computing framework that can run on anything from a single node to a cluster of thousands of nodes. It can run on clusters managed by Hadoop YARN, Apache Mesos, or Spark’s standalone cluster manager. To read more on the Spark big data processing framework, […]

Change Jupyter Notebook startup folder on Windows and Mac OS

Once we have installed Jupyter Notebook, we can start it by executing the “jupyter notebook” command in the command prompt on a Windows machine or in the terminal on a Mac. Jupyter Notebook is a very useful web-based application that can be used to write programs in many programming languages, such as Python, R, Scala, […]

The RPC server is unavailable – SQL Server 2017 installation error

During the installation of SQL Server 2017 (or other versions), we can get the error “The RPC server is unavailable” at the very last step of the installation process, while executing the action “DReplayControllerConfigAction_install_postmsi_Cpu64”. The “RPC server is unavailable” error might also occur at the “Server Configuration” step of the installation. However, typically this error occurs […]

RDD, DataFrame, and DataSet – Introduction to Spark Data Abstraction

Apache Spark is a general-purpose distributed computing engine used for big data processing, both batch and stream. It provides high-level APIs such as Spark SQL, Spark Streaming, MLlib, and GraphX for interacting with the core functionality of Apache Spark. Spark also provides several core data abstractions on top of the distributed collection of […]
