Starting Glue from Python
In addition to using Glue as a standalone program, you can import glue as a library from Python. There are (at least) two good reasons to do this:
- You are working with multidimensional data in Python, and want to use Glue for quick interactive visualization.
- You find yourself repeatedly loading the same sets of data each time you run Glue. You want to write a startup script to automate this process.
Quickly send data to Glue with qglue
The easiest way to send Python variables to Glue is to use qglue():
from glue import qglue
For example, say you are working with a Pandas DataFrame:
>>> df
<class 'pandas.core.frame.DataFrame'>
Int64Index: 500 entries, 0 to 499
Data columns (total 3 columns):
x 500 non-null values
y 500 non-null values
z 500 non-null values
dtypes: float64(3)
You can easily start up Glue with this data using:
>>> qglue(xyz=df)
This will send the data to Glue, and label it xyz.
qglue() accepts many data types as inputs. Let’s see some examples:
import numpy as np
import pandas as pd
from astropy.table import Table
x = [1, 2, 3]
y = [2, 3, 4]
u = [10, 20, 30, 40]
v = [20, 40, 60, 80]
pandas_data = pd.DataFrame({'x': x, 'y': y})
dict_data = {'u': u, 'v': v}
recarray_data = np.rec.array([(0, 1), (2, 3)],
                             dtype=[('a', 'i'), ('b', 'i')])
astropy_table = Table({'x': x, 'y': y})
bad_data = {'x': x, 'u': u}
- qglue(xy=pandas_data): constructs a dataset labeled xy, with two components (x and y)
- qglue(uv=dict_data): constructs a dataset labeled uv, with two components (u and v)
- qglue(xy=pandas_data, uv=dict_data): constructs both of the previous two datasets.
- qglue(rec=recarray_data, astro=astropy_table): constructs two datasets: rec (components a and b), and astro (components x and y)
- qglue(bad=bad_data): doesn’t work, because the two components x and u have different shapes.
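The failure in the last case can be caught before calling qglue. The following is a minimal sketch, not part of Glue: the helper name mismatched_shapes is hypothetical, and it assumes only that NumPy is installed. It reports which columns of a dict disagree in shape with the first column.

```python
import numpy as np


def mismatched_shapes(columns):
    """Return {name: shape} for columns whose shape differs from the first one.

    `columns` is a dict mapping component names to array-like values.
    An empty result means every column has the same shape.
    (Hypothetical helper for illustration; not part of Glue.)
    """
    shapes = {name: np.shape(values) for name, values in columns.items()}
    if not shapes:
        return {}
    reference = next(iter(shapes.values()))
    return {name: shape for name, shape in shapes.items() if shape != reference}


x = [1, 2, 3]
u = [10, 20, 30, 40]
v = [20, 40, 60, 80]

print(mismatched_shapes({'u': u, 'v': v}))  # u and v have matching shapes
print(mismatched_shapes({'x': x, 'u': u}))  # u differs from x
```

An empty dict means the data should be safe to pass on as a single dataset; anything else names the columns to fix first.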
Note
Reminder: in Glue, Data sets are collections of one or more Component objects. Components in a dataset are basically arrays of the same shape. For more information, see Working with Data objects.
Note
Datasets cannot be given the label links, because qglue reserves the links keyword for link descriptions.
Linking data with qglue
The Data Linking tutorial discusses how Glue uses the concept of links to compare different datasets. From the GUI, links are defined using the Link Manager. It is also possible to define some of these links with qglue.
The links keyword for qglue accepts a list of link descriptions. Each link description has the following format:
(component_list_a, component_list_b, forward_func, back_func)
- component_list_a and component_list_b are lists of component names. In the first example above, the x component in the xyz dataset is named 'xyz.x'.
- forward_func is a function which accepts one or more numpy arrays as input, and returns one or more numpy arrays as output. It computes the quantities in component_list_b, given the quantities in component_list_a.
- back_func performs the reverse operation. It can be omitted for a one-way link.
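A malformed link description is easy to write by hand. The sketch below checks a tuple against the format described above before it is handed to qglue. The helper check_link_description is hypothetical and not part of Glue; it assumes that component names follow the 'label.component' pattern and that the reverse function may be left out.

```python
def check_link_description(link):
    """Raise ValueError if `link` does not match the tuple format above.

    Hypothetical helper for illustration; not part of Glue.
    """
    if len(link) not in (3, 4):
        raise ValueError("expected 3 or 4 items, got %d" % len(link))
    names_a, names_b = link[0], link[1]
    for names in (names_a, names_b):
        for name in names:
            # a qualified name looks like 'xyz.x': dataset label, dot, component
            label, dot, component = name.partition('.')
            if not (label and dot and component):
                raise ValueError("%r is not of the form 'label.component'" % name)
    # the forward function, and the reverse function when present
    for func in link[2:]:
        if not callable(func):
            raise ValueError("%r is not callable" % (func,))
    return True


def identity(values):
    return values


link = (['xyz.x'], ['other.x'], identity, identity)
print(check_link_description(link))
```

The check only looks at the structure of the tuple. It cannot tell whether the named components exist in the datasets that are passed to qglue.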
Here’s an example:
def pounds_to_kilos(lbs):
    return lbs / 2.2

def kilos_to_pounds(kilos):
    return kilos * 2.2

def lengths_to_area(width, height):
    return width * height

# two-way link: forward and reverse functions are both given
link1 = (['data1.m_lb'], ['data2.m_kg'], pounds_to_kilos, kilos_to_pounds)
# one-way link: no reverse function
link2 = (['data1.width', 'data1.height'], ['data2.area'], lengths_to_area)
qglue(data1=data1, data2=data2, links=[link1, link2])
The first link converts between the masses in two different datasets, recorded in different units. The second link is a one-way link that computes the area of items in dataset 1, based on their width and height (there is no way to compute the width and height from the area measurements in dataset 2, so the reverse function is not provided). These links would enable the following interactions, for example:
- Overplot histograms of the mass distribution of both datasets
- Define a region in a plot of mass vs area for dataset 2, and apply that filter to dataset 1
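Because link functions operate on plain numpy arrays, they can be exercised without starting Glue. The sketch below repeats the functions from the example and checks, on made-up sample values, that the forward and reverse conversions undo each other. It assumes only that NumPy is installed.

```python
import numpy as np


def pounds_to_kilos(lbs):
    return lbs / 2.2


def kilos_to_pounds(kilos):
    return kilos * 2.2


def lengths_to_area(width, height):
    return width * height


# sample masses in pounds (made-up values for illustration)
m_lb = np.array([1.0, 150.0, 220.0])

# forward followed by reverse should reproduce the input, up to rounding
round_trip = kilos_to_pounds(pounds_to_kilos(m_lb))
print(np.allclose(round_trip, m_lb))

# a one-way link: two input arrays, one output array of the same shape
area = lengths_to_area(np.array([1.0, 2.0]), np.array([3.0, 4.0]))
print(area)
```

A pair of functions that fails this kind of round-trip check would give different answers depending on the direction in which the link is followed.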
Note
If you start Glue from a non-notebook IPython session, you will encounter an error like Multiple incompatible subclass instances of IPKernelApp are being created. The solution to this is to start Glue from a non-IPython shell, or from the notebook (see next section).
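A script can make a best-effort guess at which kind of session it is running in before calling qglue. This is only a sketch: the helper shell_kind is hypothetical, and it relies on the class names IPython uses for its shells, which are an assumption about IPython internals and may change between versions.

```python
def shell_kind():
    """Best-effort guess at the kind of Python session this code runs in.

    Hypothetical helper for illustration; not part of Glue or IPython.
    """
    try:
        # get_ipython is only defined while IPython is running
        name = get_ipython().__class__.__name__
    except NameError:
        return 'plain'
    if name == 'ZMQInteractiveShell':
        return 'notebook'  # notebook (or another kernel-based frontend)
    if name == 'TerminalInteractiveShell':
        return 'terminal'  # IPython started from a terminal
    return 'other'


if shell_kind() == 'terminal':
    print("Consider starting Glue from a plain Python shell or the notebook")
```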
Using qglue with the IPython Notebook
You can call qglue() from the IPython notebook normally. However, the default behavior is for Glue to block the execution of the notebook while the UI is running. If you would like to be able to use the notebook and Glue at the same time, run this cell before starting Glue:
%gui qt
This must be executed in a separate cell, before starting Glue.
Manual data construction
If qglue is not flexible enough for your needs, you can build data objects using the general Glue data API described in Working with Data objects.
Here’s a simple script to load data and pass it to Glue:
from glue.core.data_factories import load_data
from glue.core import DataCollection
from glue.core.link_helpers import LinkSame
from glue.app.qt.application import GlueApplication

# load 2 datasets from files
image = load_data('w5.fits')
catalog = load_data('w5_psc.vot')
dc = DataCollection([image, catalog])

# link positional information
dc.add_link(LinkSame(image.id['World x: RA---TAN'], catalog.id['RAJ2000']))
dc.add_link(LinkSame(image.id['World y: DEC--TAN'], catalog.id['DEJ2000']))

# start Glue
app = GlueApplication(dc)
app.start()
Some remarks:
- load_data() constructs Glue Data objects from files. It uses the file extension as a hint for file type
- Individual data objects are bundled inside a DataCollection
- The LinkSame function indicates that two attributes in different datasets describe the same quantity
- GlueApplication takes a DataCollection as input, and starts the GUI via start()
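A script like this stops partway through if one of the input files is missing. The sketch below checks the paths first and only then loads the data. The names missing_files and start_glue are hypothetical; the Glue calls are the ones used in the script above, and the sketch assumes load_data returns a single data object per file, as it does there.

```python
import os


def missing_files(paths):
    """Return the subset of `paths` that do not exist on disk."""
    return [path for path in paths if not os.path.exists(path)]


def start_glue(paths):
    """Load every file in `paths` and open them in Glue.

    Hypothetical wrapper for illustration; it uses only the Glue calls
    that already appear in the script above.
    """
    missing = missing_files(paths)
    if missing:
        raise IOError("cannot find: " + ", ".join(missing))

    # imported here so the file check runs before Glue is loaded
    from glue.core.data_factories import load_data
    from glue.core import DataCollection
    from glue.app.qt.application import GlueApplication

    dc = DataCollection([load_data(path) for path in paths])
    app = GlueApplication(dc)
    app.start()
```

Calling start_glue(['w5.fits', 'w5_psc.vot']) would open the same two files as the script above, without the links. Links can still be added to the DataCollection before the application is created.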
Starting Glue from a script
If you call glue with a Python script as input, Glue will simply run that script:
$ glue startup_script.py
Likewise, if you are using the pre-built Mac application, you can right-click on a script and open the file with Glue.