Introduction to Superset deployment, installation and use

Superset overview

Apache Superset is a modern, lightweight, open-source BI analysis tool. It connects to a variety of data sources, offers a rich set of chart types, supports custom dashboards, and has a friendly, easy-to-use interface.

Superset application scenarios

Because Superset can connect to common big data analysis tools such as Hive, Kylin and Druid, and supports custom dashboards, it can serve as the visualization tool for a data warehouse.

Superset installation and use

Superset official website address: https://superset.apache.org/
GitHub source address: https://github.com/apache/superset

Install the Python environment

Superset is a web application written in Python. At the time of writing, the project's development team worked with Python 3.6, so a Python 3.6 environment is the most stable choice for this walkthrough.

Installing Miniconda

conda is an open-source package and environment manager. It can install packages and their dependencies for different Python versions on the same machine and switch between those Python environments. Anaconda bundles conda, Python and a large set of preinstalled packages such as NumPy and pandas. Miniconda bundles only conda and Python.

Download the latest version of Miniconda3

wangting@ops04:/opt/software >wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh

Installing Miniconda

wangting@ops04:/opt/software >bash Miniconda3-latest-Linux-x86_64.sh
In order to continue the installation process, please review the license
agreement.
Please, press ENTER to continue	 # [continue] enter
>>> 
Do you accept the license terms? [yes|no]
[no] >>> 
Please answer 'yes' or 'no':'	# [accept the license terms] yes
>>> yes


Miniconda3 will now be installed into this location:
/home/wangting/miniconda3

  - Press ENTER to confirm the location
  - Press CTRL-C to abort the installation
  - Or specify a different location below

[/home/wangting/miniconda3] >>> /opt/module/miniconda3	# [custom installation path; the default is the user's home directory]


Do you wish the installer to initialize Miniconda3
by running conda init? [yes|no]		# Run conda initialization yes
[no] >>> yes

Thank you for installing Miniconda3!	# When this prompt appears, the installation is complete

During installation, the script automatically appends the following conda initialization block to the .bashrc file in the user's home directory

__conda_setup="$('/opt/module/miniconda3/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
if [ $? -eq 0 ]; then
    eval "$__conda_setup"
else
    if [ -f "/opt/module/miniconda3/etc/profile.d/conda.sh" ]; then
        . "/opt/module/miniconda3/etc/profile.d/conda.sh"
    else
        export PATH="/opt/module/miniconda3/bin:$PATH"
    fi
fi
unset __conda_setup

Reload the ~/.bashrc file that the installer modified

wangting@ops04:/opt/module/miniconda3 >source ~/.bashrc
(base) wangting@ops04:/opt/module/miniconda3 >

Exit the base environment

(base) wangting@ops04:/opt/module/miniconda3 >conda deactivate

Disable automatic activation of the base environment at login

After Miniconda is installed, the default base environment is activated every time a terminal is opened. The following command disables this automatic activation.

wangting@ops04:/opt/module/miniconda3 >conda config --set auto_activate_base false

Configure a domestic (China) conda mirror

wangting@ops04:/opt/module/miniconda3 >conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
wangting@ops04:/opt/module/miniconda3 >conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
wangting@ops04:/opt/module/miniconda3 >conda config --set show_channel_urls yes

Create Python environment

Create a Python 3.6 environment. The argument after --name (-n) is the custom name of the virtual environment, and python= sets the Python version for that environment.

wangting@ops04:/home/wangting >conda create --name superset python=3.6
Proceed ([y]/n)? y			# Wait for the creation to complete; packages and dependencies are downloaded during the process

Executing transaction: done
#
# To activate this environment, use
#
#     $ conda activate superset
#
# To deactivate an active environment, use
#
#     $ conda deactivate
# Seeing the output above means the environment was created

[Note] If conda create prints "WARNING: A newer version of conda exists" and then fails (as with the segmentation fault shown below), update conda with the suggested command and retry. If the environment was created successfully, the update is not required.

wangting@ops04:/home/wangting >conda create --name superset python=3.6
Collecting package metadata (current_repodata.json): done
Solving environment: done
==> WARNING: A newer version of conda exists. <==
  current version: 4.7.12
  latest version: 4.10.1

Please update conda by running

    $ conda update -n base -c defaults conda

Segmentation fault (core dumped)
wangting@ops04:/home/wangting >conda update -n base -c defaults conda
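
The creation step can also be scripted so that it is safe to re-run. The following is a minimal sketch rather than part of the original walkthrough: `env_exists` and `ensure_env` are hypothetical helper names, and the sketch assumes `conda` is already on the PATH. The `-y` flag answers the `Proceed ([y]/n)?` prompt automatically.

```shell
# Hypothetical helper: succeed if a conda environment with the given name exists.
# `conda env list` prints one environment per line with the name in column 1.
env_exists() {
    conda env list | awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Hypothetical helper: create the environment only when it is missing.
# $1 = environment name, $2 = Python version (for example 3.6).
ensure_env() {
    if env_exists "$1"; then
        echo "conda environment '$1' already exists"
    else
        conda create -y --name "$1" "python=$2"
    fi
}

# Usage (uncomment to run): ensure_env superset 3.6
```

Checking first avoids the error conda raises when asked to create an environment whose directory already exists.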

Common conda environment management commands

View all conda environments

wangting@ops04:/opt/module/miniconda3/pkgs >conda info --envs
# conda environments:
#
base                  *  /opt/module/miniconda3
superset                 /opt/module/miniconda3/envs/superset

Activate a conda environment

wangting@ops04:/opt/module/miniconda3/pkgs >conda activate superset
(superset) wangting@ops04:/opt/module/miniconda3/pkgs >
(superset) wangting@ops04:/opt/module/miniconda3/pkgs >python --version
Python 3.6.13 :: Anaconda, Inc.

Exit the current conda environment

(superset) wangting@ops04:/opt/module/miniconda3/pkgs >conda deactivate
wangting@ops04:/opt/module/miniconda3/pkgs >

Verify the environment

# Activate the environment again
wangting@ops04:/opt/module/miniconda3/pkgs >conda activate superset
(superset) wangting@ops04:/opt/module/miniconda3/pkgs >
# Start the Python interpreter inside the conda environment
(superset) wangting@ops04:/opt/module/miniconda3/pkgs >python
Python 3.6.13 |Anaconda, Inc.| (default, Jun  4 2021, 14:25:59) 
[GCC 7.5.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> exit()
# Inside the conda environment, install a module with pip to verify that pip works; -i specifies the package index URL (the official PyPI index is used by default)
(superset) wangting@ops04:/opt/module/miniconda3/pkgs >pip install gunicorn -i https://pypi.douban.com/simple/
Looking in indexes: https://pypi.douban.com/simple/
Collecting gunicorn
  Downloading https://pypi.doubanio.com/packages/e4/dd/5b190393e6066286773a67dfcc2f9492058e9b57c4867a95f1ba5caf0a83/gunicorn-20.1.0-py3-none-any.whl (79 kB)
     |████████████████████████████████| 79 kB 2.3 MB/s 
Requirement already satisfied: setuptools>=3.0 in /opt/module/miniconda3/envs/superset/lib/python3.6/site-packages (from gunicorn) (52.0.0.post20210125)
Installing collected packages: gunicorn
Successfully installed gunicorn-20.1.0
(superset) wangting@ops04:/opt/module/miniconda3/pkgs >

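Create a second conda environment as a further check. --name can be abbreviated as -n. Always specify the Python version: if no package spec such as python=3.6 is given, conda creates an empty environment without its own Python interpreter.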

# The process is the same as the conda environment with superset above
wangting@ops04:/opt/module/miniconda3/pkgs >conda create -n wangting python=3.6
wangting@ops04:/opt/module/miniconda3/pkgs >
wangting@ops04:/opt/module/miniconda3/pkgs >conda info --envs
# conda environments:
#
base                  *  /opt/module/miniconda3
superset                 /opt/module/miniconda3/envs/superset
wangting                 /opt/module/miniconda3/envs/wangting

wangting@ops04:/opt/module/miniconda3/pkgs >conda activate wangting
(wangting) wangting@ops04:/opt/module/miniconda3/pkgs >

Superset deployment

Superset official website address: http://superset.apache.org/

Before installing Superset, install the following required dependencies

wangting@ops04:/opt/software >sudo yum install -y python-setuptools
wangting@ops04:/opt/software >sudo yum install -y gcc gcc-c++ libffi-devel python-devel python-pip python-wheel openssl-devel cyrus-sasl-devel openldap-devel

Activate the conda superset environment for installation and deployment

wangting@ops04:/opt/software >conda activate superset
(superset) wangting@ops04:/opt/software >

Install (update) setuptools and pip

(superset) wangting@ops04:/opt/software >pip install --upgrade setuptools pip -i https://pypi.douban.com/simple/
Looking in indexes: https://pypi.douban.com/simple/
Requirement already satisfied: setuptools in /opt/module/miniconda3/envs/superset/lib/python3.6/site-packages (52.0.0.post20210125)
Collecting setuptools
  Downloading https://pypi.doubanio.com/packages/4e/78/56aa1b5f4d8ac548755ae767d84f0be54fdd9d404197a3d9e4659d272348/setuptools-57.0.0-py3-none-any.whl (821 kB)
     |████████████████████████████████| 821 kB 2.4 MB/s 
Requirement already satisfied: pip in /opt/module/miniconda3/envs/superset/lib/python3.6/site-packages (21.1.2)
Installing collected packages: setuptools
  Attempting uninstall: setuptools
    Found existing installation: setuptools 52.0.0.post20210125
    Uninstalling setuptools-52.0.0.post20210125:
      Successfully uninstalled setuptools-52.0.0.post20210125
Successfully installed setuptools-57.0.0
(superset) wangting@ops04:/opt/software >

Install superset

# Installing apache-superset pulls in a series of dependencies; wait for the installation to complete
(superset) wangting@ops04:/opt/software >pip install apache-superset -i https://pypi.douban.com/simple/

Initialize the Superset database

(superset) wangting@ops04:/opt/software >superset db upgrade
Traceback (most recent call last):
  File "/opt/module/miniconda3/envs/superset/bin/superset", line 5, in <module>
    from superset.cli import superset
  File "/opt/module/miniconda3/envs/superset/lib/python3.6/site-packages/superset/__init__.py", line 21, in <module>
    from superset.app import create_app
  File "/opt/module/miniconda3/envs/superset/lib/python3.6/site-packages/superset/app.py", line 45, in <module>
    from superset.security import SupersetSecurityManager
  File "/opt/module/miniconda3/envs/superset/lib/python3.6/site-packages/superset/security/__init__.py", line 17, in <module>
    from superset.security.manager import SupersetSecurityManager  # noqa: F401
  File "/opt/module/miniconda3/envs/superset/lib/python3.6/site-packages/superset/security/manager.py", line 44, in <module>
    from superset import sql_parse
  File "/opt/module/miniconda3/envs/superset/lib/python3.6/site-packages/superset/sql_parse.py", line 18, in <module>
    from dataclasses import dataclass
ModuleNotFoundError: No module named 'dataclasses'
# The command fails because the dataclasses module cannot be found; install it as the error indicates
(superset) wangting@ops04:/opt/software >pip install dataclasses
Collecting dataclasses
  Downloading dataclasses-0.8-py3-none-any.whl (19 kB)
Installing collected packages: dataclasses
Successfully installed dataclasses-0.8
# Run the initialization again; this time it completes
(superset) wangting@ops04:/opt/software >superset db upgrade

Create the administrator user

(superset) wangting@ops04:/opt/software >export FLASK_APP=superset
(superset) wangting@ops04:/opt/software >flask fab create-admin
Username [admin]: 				# Admin user name for logging in to the web UI; press Enter to accept the default
User first name [admin]: 		# User information; press Enter to accept the default
User last name [user]: 			# User information; press Enter to accept the default
Email [admin@fab.org]: 			# Email address; press Enter to accept the default
Password: 				        # Set the admin password for the web UI (123456 in this example)
Repeat for confirmation: 		# Enter the same password again
logging was configured successfully
INFO:superset.utils.logging_configurator:logging was configured successfully
/opt/module/miniconda3/envs/superset/lib/python3.6/site-packages/flask_caching/__init__.py:202: UserWarning: Flask-Caching: CACHE_TYPE is set to null, caching is effectively disabled.
  "Flask-Caching: CACHE_TYPE is set to null, "
No PIL installation found
INFO:superset.utils.screenshots:No PIL installation found
Recognized Database Authentications.
Admin User admin created.
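
For automated deployments the prompts can be avoided by passing the values as options to `flask fab create-admin`. The following is a minimal sketch under assumptions: `create_admin` is a hypothetical helper name, and reading the password from an `ADMIN_PASSWORD` environment variable is a choice made for this example, not something the original setup does.

```shell
# Hypothetical helper: create the admin user without interactive prompts.
# $1 = user name (default admin), $2 = email (default admin@fab.org).
# The password comes from the ADMIN_PASSWORD environment variable (an
# assumption of this sketch) so it is not hard-coded in the script.
create_admin() {
    if [ -z "${ADMIN_PASSWORD:-}" ]; then
        echo "set ADMIN_PASSWORD before calling create_admin" >&2
        return 1
    fi
    export FLASK_APP=superset
    flask fab create-admin \
        --username "${1:-admin}" \
        --firstname admin \
        --lastname user \
        --email "${2:-admin@fab.org}" \
        --password "$ADMIN_PASSWORD"
}

# Usage (uncomment to run): ADMIN_PASSWORD=123456 create_admin admin admin@fab.org
```

Note that the password is still visible in the process list while the command runs, so this suits test environments better than shared hosts.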

Superset initialization

(superset) wangting@ops04:/opt/software >superset init
logging was configured successfully
INFO:superset.utils.logging_configurator:logging was configured successfully
...
...
INFO:superset.security.manager:Creating missing metrics permissions
Cleaning faulty perms
INFO:superset.security.manager:Cleaning faulty perms
(superset) wangting@ops04:/opt/software >

Install gunicorn to provide the HTTP service

(superset) wangting@ops04:/opt/software >pip install gunicorn -i https://pypi.douban.com/simple/
Looking in indexes: https://pypi.douban.com/simple/
Requirement already satisfied: gunicorn in /opt/module/miniconda3/envs/superset/lib/python3.6/site-packages (20.0.4)
Requirement already satisfied: setuptools>=3.0 in /opt/module/miniconda3/envs/superset/lib/python3.6/site-packages (from gunicorn) (57.0.0)

Start Superset

(superset) wangting@ops04:/opt/software >gunicorn --workers 5 --timeout 120 --bind ops04:8787  "superset.app:create_app()" --daemon
(superset) wangting@ops04:/opt/software >

[Note:] ops04 is the host name; /etc/hosts contains an entry that resolves it to the server's IP address.
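
When gunicorn runs as a daemon it helps to record the master process ID and write logs to files. The following is a minimal sketch, not the original author's setup: `start_superset` is a hypothetical helper, and the `/tmp/superset` directory is an assumed location. The `--pid`, `--access-logfile` and `--error-logfile` options are standard gunicorn settings.

```shell
# Assumed location for the PID file and logs; adjust to your own layout.
SUPERSET_RUN_DIR="${SUPERSET_RUN_DIR:-/tmp/superset}"

# Hypothetical helper: start Superset under gunicorn as a daemon.
# $1 = bind address (default ops04:8787, as used in this walkthrough).
start_superset() {
    mkdir -p "$SUPERSET_RUN_DIR"
    gunicorn \
        --workers 5 \
        --timeout 120 \
        --bind "${1:-ops04:8787}" \
        --pid "$SUPERSET_RUN_DIR/gunicorn.pid" \
        --access-logfile "$SUPERSET_RUN_DIR/access.log" \
        --error-logfile "$SUPERSET_RUN_DIR/error.log" \
        --daemon \
        "superset.app:create_app()"
}

# Usage (uncomment to run inside the superset conda environment): start_superset
```

With a PID file in place, the master process can later be signalled directly instead of being searched for in the process list.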

View Superset running status

(superset) wangting@ops04:/opt/software >netstat -tnlpu|grep 8787
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
tcp        0      0 11.8.38.86:8787         0.0.0.0:*               LISTEN      18884/python        
(superset) wangting@ops04:/opt/software >ps -ef | grep 8787 | grep -v grep 
wangting  18884      1  0 11:32 ?        00:00:00 /opt/module/miniconda3/envs/superset/bin/python /opt/module/miniconda3/envs/superset/bin/gunicorn --workers 5 --timeout 120 --bind ops04:8787 superset.app:create_app() --daemon
wangting  18887  18884  5 11:32 ?        00:00:04 /opt/module/miniconda3/envs/superset/bin/python /opt/module/miniconda3/envs/superset/bin/gunicorn --workers 5 --timeout 120 --bind ops04:8787 superset.app:create_app() --daemon
wangting  18888  18884  5 11:32 ?        00:00:04 /opt/module/miniconda3/envs/superset/bin/python /opt/module/miniconda3/envs/superset/bin/gunicorn --workers 5 --timeout 120 --bind ops04:8787 superset.app:create_app() --daemon
wangting  18890  18884  5 11:32 ?        00:00:04 /opt/module/miniconda3/envs/superset/bin/python /opt/module/miniconda3/envs/superset/bin/gunicorn --workers 5 --timeout 120 --bind ops04:8787 superset.app:create_app() --daemon
wangting  18892  18884  5 11:32 ?        00:00:04 /opt/module/miniconda3/envs/superset/bin/python /opt/module/miniconda3/envs/superset/bin/gunicorn --workers 5 --timeout 120 --bind ops04:8787 superset.app:create_app() --daemon
wangting  18893  18884  5 11:32 ?        00:00:04 /opt/module/miniconda3/envs/superset/bin/python /opt/module/miniconda3/envs/superset/bin/gunicorn --workers 5 --timeout 120 --bind ops04:8787 superset.app:create_app() --daemon
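
A listening port shows that gunicorn started, but not that the application answers requests. The following is a minimal sketch of an HTTP-level check: `wait_for_superset` is a hypothetical helper, and it assumes that `curl` is installed and that the installed Superset version exposes the `/health` endpoint.

```shell
# Hypothetical helper: poll a health URL until it answers or the tries run out.
# $1 = URL (default http://ops04:8787/health), $2 = number of tries (default 30).
wait_for_superset() {
    url="${1:-http://ops04:8787/health}"
    tries="${2:-30}"
    while [ "$tries" -gt 0 ]; do
        # -f makes curl return non-zero on HTTP errors; -sS hides progress
        # output but still prints error messages.
        if curl -fsS -o /dev/null "$url"; then
            echo "Superset is up at $url"
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    echo "Superset did not answer at $url" >&2
    return 1
}

# Usage (uncomment to run): wait_for_superset http://ops04:8787/health 30
```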

Stop the Superset service (when necessary)

# Equivalent to killing the gunicorn process IDs one by one; Superset itself does not provide a command to stop the service
(superset) wangting@ops04:/opt/software >ps -ef | awk '/gunicorn/ && !/awk/{print $2}' | xargs kill -9
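
`kill -9` ends the workers immediately, including any request in progress. The gunicorn master treats SIGTERM as a request for graceful shutdown, so a gentler stop is possible when its process ID is known. The following is a minimal sketch under one assumption that differs from the start command used in this walkthrough: gunicorn was started with `--pid` pointing at the file named in `PID_FILE`. The name `stop_superset` and the default path are hypothetical.

```shell
# Assumed PID file location; it must match the --pid option given to gunicorn.
PID_FILE="${PID_FILE:-/tmp/superset/gunicorn.pid}"

# Hypothetical helper: ask the gunicorn master to shut down gracefully.
stop_superset() {
    if [ ! -f "$PID_FILE" ]; then
        echo "no PID file at $PID_FILE; is Superset running?" >&2
        return 1
    fi
    master_pid="$(cat "$PID_FILE")"
    # SIGTERM lets workers finish their current requests before exiting.
    if ! kill -TERM "$master_pid" 2>/dev/null; then
        echo "process $master_pid not found" >&2
        return 1
    fi
    # Wait up to 30 seconds for the master process to disappear.
    waited=0
    while kill -0 "$master_pid" 2>/dev/null; do
        if [ "$waited" -ge 30 ]; then
            echo "process $master_pid still running after 30 s" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    echo "Superset stopped"
}

# Usage (uncomment to run): stop_superset
```

If the function reports that the process is still running, the `kill -9` pipeline above remains available as a last resort.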

Accessing Superset

http://ops04:8787/login/

[Note:]

  1. The user name and password are the ones defined earlier with create-admin

  2. The URL http://ops04:8787/login/ works because the IP of ops04 is mapped in the client's C:\Windows\System32\drivers\etc\hosts file. Alternatively, access it directly by replacing the host name with the IP, as ip:8787

Installing data source drivers for Superset

Superset needs a different driver package for each kind of data source. The official documentation describes them at the following address
https://superset.apache.org/docs/databases/installing-database-drivers

Common data source pip installation commands and connection string formats

| Database | PyPI package | Connection String |
| --- | --- | --- |
| Apache Hive | pip install pyhive | hive://hive@{hostname}:{port}/{database} |
| Apache Impala | pip install impyla | impala://{hostname}:{port}/{database} |
| Apache Kylin | pip install kylinpy | kylin://<username>:<password>@<hostname>:<port>/<project>?<param1>=<value1>&<param2>=<value2> |
| Apache Spark SQL | pip install pyhive | hive://hive@{hostname}:{port}/{database} |
| Big Query | pip install pybigquery | bigquery://{project_id} |
| Elasticsearch | pip install elasticsearch-dbapi | elasticsearch+http://{user}:{password}@{host}:9200/ |
| MySQL | pip install mysqlclient | mysql://<UserName>:<DBPassword>@<Database Host>/<Database Name> |
| Oracle | pip install cx_Oracle | oracle:// |
| PostgreSQL | pip install psycopg2 | postgresql://<UserName>:<DBPassword>@<Database Host>/<Database Name> |
| Presto | pip install pyhive | presto:// |
| SQLite | (none needed) | sqlite:// |
| SQL Server | pip install pymssql | mssql:// |

Installing the mysqlclient dependency

(superset) wangting@ops04:/opt/software >conda install mysqlclient
Proceed ([y]/n)? 			# y
Preparing transaction: done
Verifying transaction: done
Executing transaction: done

Restart Superset after installation

(superset) wangting@ops04:/opt/software >ps -ef | awk '/gunicorn/ && !/awk/{print $2}' | xargs kill -9
(superset) wangting@ops04:/opt/software >netstat -tnlpu|grep 8787
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
(superset) wangting@ops04:/opt/software >gunicorn --workers 5 --timeout 120 --bind ops04:8787  "superset.app:create_app()" --daemon
(superset) wangting@ops04:/opt/software >netstat -tnlpu|grep 8787
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
tcp        0      0 11.8.38.86:8787         0.0.0.0:*               LISTEN      44883/python        
(superset) wangting@ops04:/opt/software >

Data source configuration

Database configuration

Add the database and save

Data > Databases

Database name: gmall (a display name for the database; change it as appropriate)

SQLAlchemy URI: mysql://root:123456@11.8.38.86/gmall?charset=utf8

root # username

123456 # database password

11.8.38.86 # database IP

gmall # database name

charset=utf8 # character set

No port is given in the URI, so the default MySQL port 3306 is used
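
The parts listed above can be assembled into the URI with a small helper. The following is a minimal sketch: `mysql_uri` is a hypothetical function name, and it does not percent-encode its arguments, so a password containing characters such as `@`, `:` or `/` would need to be encoded first.

```shell
# Hypothetical helper: assemble a MySQL SQLAlchemy URI from its parts.
# $1 = user, $2 = password, $3 = host, $4 = database,
# $5 = port (optional; defaults to 3306, the standard MySQL port).
mysql_uri() {
    printf 'mysql://%s:%s@%s:%s/%s?charset=utf8\n' "$1" "$2" "$3" "${5:-3306}" "$4"
}

mysql_uri root 123456 11.8.38.86 gmall
# prints mysql://root:123456@11.8.38.86:3306/gmall?charset=utf8
```

Writing the port explicitly is equivalent to omitting it when the server listens on 3306, and it makes a non-default port easy to substitute.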

Add a test data table

Table: supersetwt

Table configuration

Data > Datasets: click the + sign to add a dataset

Once the database and the table have been added, the data source is ready for building charts

Create a dashboard

Dashboards: click the + sign to add a dashboard

To create a chart, click the table and select a chart template

Edit the chart

Click SAVE, then open the dashboard to see the first version

The chart is not very expressive because the weight values are too close together. Modify the MySQL data so that the weight drops more noticeably

Click the small menu in the upper right corner of the chart to refresh its data

Continue adding chart elements (reading statistics)

Continue adding chart elements (maximum body temperature in the last week)

The dashboard layout can be edited
