FiftyOne Environments¶
This guide describes best practices for using FiftyOne with data stored in various environments, including local machines, remote servers, and cloud storage.
Terminology¶
Local machine: Data is stored on the same computer that will be used to launch the App
Remote machine: Data is stored on disk on a separate machine (typically a remote server) from the one that will be used to launch the App
Notebooks: You are working from a Jupyter Notebook or a Google Colab Notebook.
Cloud storage: Data is stored in a cloud bucket (e.g., S3, GCS, or Azure)
Local data¶
When working with data that is stored on disk on a machine with a display, you can directly load a dataset and then launch the App:
# On local machine
import fiftyone as fo

dataset = fo.Dataset(name="my_dataset")

session = fo.launch_app(dataset)  # (optional) port=XXXX
From here, you can explore the dataset interactively from the App and from your Python shell by manipulating the session object.
Note
You can use custom ports when launching the App in order to operate multiple App instances simultaneously on your machine.
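Each App instance needs a port that is not already in use. The sketch below is one possible way to pick such a port with Python's standard library. The helper name find_free_port and the size of the search range are illustrative assumptions and are not part of FiftyOne.

```python
import socket


def find_free_port(start=5151, attempts=50):
    """Return the first port at or after `start` that can be bound on localhost."""
    last = min(start + attempts, 65536)  # never probe past the highest valid port
    for port in range(start, last):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind(("127.0.0.1", port))
            except OSError:
                continue  # port is already in use; try the next one
            return port
    raise RuntimeError(f"No free port found in range {start}-{last - 1}")


# Hypothetical usage with the launch call shown above:
# session = fo.launch_app(dataset, port=find_free_port())
```

The port is released as soon as the check finishes, so another process could still claim it before the App starts. Treat the result as a good candidate rather than a reservation.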
Remote data¶
FiftyOne supports working with data that is stored on a remote machine that you have ssh access to. The basic workflow is to load a dataset on the remote machine via the FiftyOne Python library, launch a remote session, and connect to the session on your local machine where you can then interact with the App.
First, ssh into your remote machine and install FiftyOne if necessary.
Then load a dataset using Python on the remote machine and launch a remote session:
# On remote machine
import fiftyone as fo

dataset = fo.Dataset(name="my_dataset")

session = fo.launch_app(dataset, remote=True)  # (optional) port=XXXX
Leave this session running, and note that instructions for connecting to this remote session were printed to your terminal (these are described below).
If you do not have fiftyone installed on your local machine, and do not want to install it, you can set up port forwarding manually and view the App in your browser at http://localhost:5151.
# On local machine
# `[<username>@]<hostname>` refers to your remote machine
ssh -N -L 5151:127.0.0.1:5151 [<username>@]<hostname>
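The same tunnel can be assembled from Python, which may help when the host or ports come from a script. This is a minimal sketch using only the standard library. The function name build_tunnel_command and its parameters are hypothetical; the command it produces has the same form as the ssh invocation above.

```python
import shlex


def build_tunnel_command(hostname, username=None, local_port=5151, remote_port=5151):
    """Build an `ssh -N -L` command that forwards local_port to remote_port."""
    destination = f"{username}@{hostname}" if username else hostname
    return [
        "ssh",
        "-N",  # do not run a remote command; only forward ports
        "-L",
        f"{local_port}:127.0.0.1:{remote_port}",
        destination,
    ]


command = build_tunnel_command("example-host", username="alice")
print(shlex.join(command))
```

Passing the resulting list to subprocess.run() would open the tunnel and block until it is interrupted, just like running the command in a terminal.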
If you have fiftyone installed on the local machine, you can use the CLI to automatically configure port forwarding and open the App in either the desktop App or your web browser.
In a local terminal, run the command:
# On local machine
fiftyone app connect --destination <user>@<remote-ip-address> --port 5151
The above instructions assume that you used the default port 5151 when launching the remote session on the remote machine. If you used a custom port (or if you customized your default port via the default_app_port parameter of your FiftyOne config), then substitute the appropriate value in the local commands too.
Note
If you use ssh keys to connect to your remote machine, you can use the optional --ssh-key argument of the fiftyone app connect command. However, if you are using this key regularly, it is recommended to add it to your ~/.ssh/config as the default IdentityFile.
Note
You can use custom ports when launching remote sessions in order to serve multiple remote sessions simultaneously.
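When several remote sessions are served on different ports, each one needs its own connect command on the local machine. The sketch below generates those commands from a mapping of session names to ports. The helper build_connect_command, the session names, and the example destination are hypothetical. It also assumes that --ssh-key takes the path to a key file.

```python
import shlex


def build_connect_command(destination, port=5151, ssh_key=None):
    """Build a `fiftyone app connect` command from the flags described above."""
    command = [
        "fiftyone", "app", "connect",
        "--destination", destination,
        "--port", str(port),
    ]
    if ssh_key is not None:
        # Assumption: --ssh-key expects the path to the private key file
        command += ["--ssh-key", ssh_key]
    return command


# Hypothetical remote sessions, each launched with a different port
sessions = {"train": 5151, "validation": 5152}

for name, port in sessions.items():
    command = build_connect_command("alice@example-host", port=port)
    print(f"{name}: {shlex.join(command)}")
```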
Notebooks¶
FiftyOne officially supports Jupyter Notebooks and Google Colab Notebooks.
To use FiftyOne in a notebook, simply install fiftyone via pip:
!pip install fiftyone
and load datasets as usual. When you run launch_app() in a notebook, an App window will be opened in the output of your current cell.
import fiftyone as fo

dataset = fo.Dataset(name="my_dataset")

# Creates a session and opens the App in the output of the cell
session = fo.launch_app(dataset)
Any time you update the state of your session object, e.g., by setting session.dataset or session.view, a new App window will be automatically opened in the output of the current cell. The previously active App will be replaced with a screenshot of itself.
An App that was replaced with a screenshot can be reactivated by clicking on the screenshot, provided you are in the notebook environment in which it was created. Note that the reactivated App will load the current state of the session object, not the state in which the screenshot was taken.
# A new App window will be created in the output of this cell
session.view = dataset.take(10)
A screenshot of the active App can be taken with session.freeze(). This is useful when you are finished with your notebook and ready to share it with others.
# Ensure only screenshots of FiftyOne Apps exist, so the notebook can be
# shared
session.freeze()
Manually controlling App instances¶
If you would like to manually control when new App instances are created in a notebook, you can pass the auto=False flag to launch_app():
# Creates a session but does not open an App instance
session = fo.launch_app(dataset, auto=False)
When auto=False is provided, a new App window is created only when you call session.show():
# Update the session's view; no App window is created
session.view = dataset.take(10)

# In another cell

# Now open an App window in the cell's output
session.show()
As usual, this App window will remain connected to your session object, so it will stay in sync with your session whenever it is active.
Note
If you run session.show() in multiple cells, only the most recently created App window will be active, i.e., synced with the session object.
You can reactivate an older cell by clicking the link in the deactivated App window, or by running the cell again. This will deactivate the previously active cell.
Opening the App in a dedicated tab¶
If you are working from a Jupyter notebook, you can open the App in a separate browser tab rather than working with it in cell output(s).
To do this, pass the auto=False flag to launch_app() when you launch the App and then call session.open_tab():
# Launch the App in a dedicated browser tab
session = fo.launch_app(dataset, auto=False)
session.open_tab()
Using the desktop App¶
If you are working from a Jupyter notebook on a machine with the FiftyOne Desktop App installed, you can optionally open the desktop App rather than working with the App in cell output(s).
To do this, pass the desktop=True flag to launch_app():
# Creates a session and launches the desktop App
session = fo.launch_app(dataset, desktop=True)
Cloud storage¶
FiftyOne does not yet support accessing data directly in a cloud bucket. Instead, the best practice that we recommend is to mount the cloud bucket as a local drive on a cloud compute instance.
The following sections describe how to do this in the AWS, Google Cloud, and Microsoft Azure cloud environments.
Amazon Web Services¶
If your data is stored in an AWS S3 bucket, we recommend mounting the bucket as a local drive on an EC2 instance and then accessing the data using the standard workflow for remote data.
The steps below outline the process.
Step 1
Create an EC2 instance. We recommend a Linux instance.
Step 2
Now ssh into the instance and install FiftyOne if necessary.
# On remote machine
pip install fiftyone
Note
You may need to install some system packages on your compute instance in order to run FiftyOne.
Step 3
Mount the S3 bucket as a local drive.
We recommend using s3fs-fuse for this. You will need to make a .passwd-s3fs file that contains your AWS credentials as outlined in the s3fs-fuse README.
# On remote machine
s3fs <bucket-name> /path/to/mount/point \
-o passwd_file=.passwd-s3fs \
-o umask=0007,uid=<your-user-id>
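Before building a dataset, it can be worth confirming that the mount actually succeeded, since an unmounted directory simply looks empty. The following is a small sketch using only the standard library. The function name describe_mount and its three status strings are illustrative assumptions.

```python
import os


def describe_mount(mount_point):
    """Report whether mount_point is missing, a plain directory, or a mounted filesystem."""
    if not os.path.exists(mount_point):
        return "missing"
    if not os.path.ismount(mount_point):
        return "not mounted"  # the directory exists but nothing is mounted on it
    return "mounted"


# Replace with the mount point passed to the mount command above
print(describe_mount("/path/to/mount/point"))
```

The same check applies to any directory-based mount, so it can be reused for buckets mounted with other FUSE tools.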
Step 4
Now that you can access your data from the compute instance, start up Python and create a FiftyOne dataset whose filepaths are in the mount point you specified above. Then launch the App as a remote session:
# On remote machine
import fiftyone as fo

dataset = fo.Dataset(name="my_dataset")

session = fo.launch_app(dataset, remote=True)  # (optional) port=XXXX
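To create a dataset whose filepaths live under the mount point, you first need the list of media files. The sketch below gathers image paths with the standard library. The helper name list_media and the extension set are assumptions to adapt to your data, and the commented add_images() call is a hypothetical usage rather than a required workflow.

```python
from pathlib import Path

# Assumption: adjust this set to the media types stored in your bucket
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def list_media(mount_point, extensions=IMAGE_EXTENSIONS):
    """Recursively collect files under mount_point whose suffix is in extensions."""
    root = Path(mount_point)
    return sorted(
        str(path)
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in extensions
    )


# Hypothetical usage on the remote machine:
# filepaths = list_media("/path/to/mount/point")
# dataset = fo.Dataset(name="my_dataset")
# dataset.add_images(filepaths)
```

Walking a FUSE-mounted bucket may be slow for large buckets, so consider pointing the helper at a specific subdirectory.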
Step 5
Finally, on your local machine, connect to the remote session that you started on the cloud instance.
# On local machine
fiftyone app connect --destination <user>@<remote-ip-address> --port 5151
The above instructions assume that you used the default port 5151 when launching the remote session on the remote machine. If you used a custom port (or if you customized your default port via the default_app_port parameter of your FiftyOne config), then substitute the appropriate value in the local commands too.
Note
If you use ssh keys to connect to your remote machine, you can use the optional --ssh-key argument of the fiftyone app connect command. However, if you are using this key regularly, it is recommended to add it to your ~/.ssh/config as the default IdentityFile.
Note
You can use custom ports when launching remote sessions in order to serve multiple remote sessions simultaneously.
Google Cloud¶
If your data is stored in a Google Cloud storage bucket, we recommend mounting the bucket as a local drive on a GC compute instance and then accessing the data using the standard workflow for remote data.
The steps below outline the process.
Step 1
Create a GC compute instance. We recommend a Linux instance.
Step 2
Now ssh into the instance and install FiftyOne if necessary.
# On remote machine
pip install fiftyone
Note
You may need to install some system packages on your compute instance in order to run FiftyOne.
Step 3
Mount the GCS bucket as a local drive.
We recommend using gcsfuse to do this:
# On remote machine
gcsfuse my-bucket /path/to/mount --implicit-dirs
Step 4
Now that you can access your data from the compute instance, start up Python and create a FiftyOne dataset whose filepaths are in the mount point you specified above. Then launch the App as a remote session:
# On remote machine
import fiftyone as fo

dataset = fo.Dataset(name="my_dataset")

session = fo.launch_app(dataset, remote=True)  # (optional) port=XXXX
Step 5
Finally, on your local machine, connect to the remote session that you started on the cloud instance.
# On local machine
fiftyone app connect --destination <user>@<remote-ip-address> --port 5151
The above instructions assume that you used the default port 5151 when launching the remote session on the remote machine. If you used a custom port (or if you customized your default port via the default_app_port parameter of your FiftyOne config), then substitute the appropriate value in the local commands too.
Note
If you use ssh keys to connect to your remote machine, you can use the optional --ssh-key argument of the fiftyone app connect command. However, if you are using this key regularly, it is recommended to add it to your ~/.ssh/config as the default IdentityFile.
Note
You can use custom ports when launching remote sessions in order to serve multiple remote sessions simultaneously.
Microsoft Azure¶
If your data is stored in an Azure storage bucket, we recommend mounting the bucket as a local drive on an Azure compute instance and then accessing the data using the standard workflow for remote data.
The steps below outline the process.
Step 1
Create an Azure compute instance. We recommend a Linux instance.
Step 2
Now ssh into the instance and install FiftyOne if necessary.
# On remote machine
pip install fiftyone
Note
You may need to install some system packages on your compute instance in order to run FiftyOne.
Step 3
Mount the Azure storage container in the instance.
This is fairly straightforward if your data is stored in a blob container. We recommend using blobfuse for this.
Step 4
Now that you can access your data from the compute instance, start up Python and create a FiftyOne dataset whose filepaths are in the mount point you specified above. Then launch the App as a remote session:
# On remote machine
import fiftyone as fo

dataset = fo.Dataset(name="my_dataset")

session = fo.launch_app(dataset, remote=True)  # (optional) port=XXXX
Step 5
Finally, on your local machine, connect to the remote session that you started on the cloud instance.
# On local machine
fiftyone app connect --destination <user>@<remote-ip-address> --port 5151
The above instructions assume that you used the default port 5151 when launching the remote session on the remote machine. If you used a custom port (or if you customized your default port via the default_app_port parameter of your FiftyOne config), then substitute the appropriate value in the local commands too.
Note
If you use ssh keys to connect to your remote machine, you can use the optional --ssh-key argument of the fiftyone app connect command. However, if you are using this key regularly, it is recommended to add it to your ~/.ssh/config as the default IdentityFile.
Note
You can use custom ports when launching remote sessions in order to serve multiple remote sessions simultaneously.
Setting up a cloud instance¶
When you create a fresh cloud compute instance, you may need to install some system packages in order to install and use FiftyOne.
For example, the script below shows a set of commands that may be used to configure a Debian-like Linux instance, after which you should be able to successfully install FiftyOne.
# Example setup script for a Debian-like virtual machine
# System packages
sudo apt update
sudo apt -y upgrade
sudo apt install -y build-essential
sudo apt install -y unzip
sudo apt install -y cmake
sudo apt install -y cmake-data
sudo apt install -y pkg-config
sudo apt install -y libsm6
sudo apt install -y libxext6
sudo apt install -y libssl-dev
sudo apt install -y libffi-dev
sudo apt install -y libxml2-dev
sudo apt install -y libxslt1-dev
sudo apt install -y zlib1g-dev
sudo apt install -y python3
sudo apt install -y python-dev
sudo apt install -y python3-dev
sudo apt install -y python3-pip
sudo apt install -y python3-venv
sudo apt install -y ffmpeg # if working with video
# (Recommended) Create a virtual environment
python3 -m venv fiftyone-env
. fiftyone-env/bin/activate
# Python packages
pip install --upgrade pip setuptools wheel
pip install ipython
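After running a setup script like the one above, a quick check from Python can confirm that the expected command-line tools are on the PATH. This is a minimal sketch. The helper name missing_tools and the default tool list are assumptions, and ffmpeg only matters if you work with video.

```python
import shutil


def missing_tools(tools=("python3", "pip", "ffmpeg")):
    """Return the names in `tools` that cannot be found on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools()
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("All expected tools were found")
```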