Part 1: calculate gradients

There are two ways of getting gradients:

Backward

import torch

x = torch.tensor([3.0], requires_grad=True)
y = torch.pow(x, 2)  # y = x**2
y.backward(retain_graph=True)
print(x.grad)  # tensor([6.])

Grad

x = torch.tensor([3.0], requires_grad=True)
y = torch.pow(x, 2)
grad_1 = torch.autograd.grad(y, x, create_graph=True)
print(grad_1[0].item())  # 6.0

Part 2: Note

(1) Gradients accumulate; they are not cleared unless you clear them explicitly

w = torch.tensor([1.], requires_grad=True)
x = torch.tensor([2.], requires_grad=True)

for i in range(4):
    a = torch.add(w, x)
    b = torch.add(w, 1)
    y = torch.mul(a, b)   # y = (w + x) * (w + 1)
    y.backward()
    print(w.grad)         # tensor([5.]) on every iteration only because of the line below
    w.grad.zero_()        # without this, w.grad would accumulate to 10, 15, 20, ...

(2) Gradient and gradient function (grad_fn)

import torch
w = torch.tensor([1.], requires_grad=True)
x = torch.tensor([2.], requires_grad=True)
# y=(x+w)*(w+1)
a = torch.add(w, x)  # call a.retain_grad() here if you also want a.grad kept after backward()
b = torch.add(w, 1)
y = torch.mul(a, b)

y.backward()
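To see the gradient function that autograd records for each tensor, one can inspect grad_fn and is_leaf; a minimal sketch continuing the snippet above:

# leaf tensors created by the user have no grad_fn
print(w.is_leaf, w.grad_fn)   # True None
# intermediate tensors record the operation that created them
print(a.is_leaf, a.grad_fn)   # False <AddBackward0 object ...>
print(y.grad_fn)              # <MulBackward0 object ...>
# after backward(), gradients are kept only on leaf tensors
print(w.grad)                 # tensor([5.]), since dy/dw = (w + 1) + (x + w) = 5
print(a.grad)                 # None (with a warning) unless a.retain_grad() was called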


Part 1: What is Metaflow?

Data is accessed from a data warehouse, which can be a folder of files, a database, or a multi-petabyte data lake.

Part 2: Notes

  1. Generate a PNG image of the flow graph
python helloworld.py output-dot | dot -Tpng -o /tmp/graph.png

This command renders the flow's step graph (DAG) into a PNG file; it requires Graphviz's dot tool to be installed.
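For context, a minimal flow looks roughly like this (a sketch; the class and step names are illustrative and need not match the actual helloworld.py from the Metaflow tutorial):

from metaflow import FlowSpec, step

class HelloFlow(FlowSpec):

    @step
    def start(self):
        # every flow begins with a step called start
        print("Starting the flow")
        self.next(self.hello)

    @step
    def hello(self):
        print("Hello, Metaflow!")
        self.next(self.end)

    @step
    def end(self):
        # every flow finishes with a step called end
        print("Flow finished")

if __name__ == "__main__":
    HelloFlow()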


Part 1: Preliminary

  • True Positives (TP) are the people who truly have the COVID-19 virus and whom the test correctly labels as sick (Positive).
  • True Negatives (TN) are the people who truly DO NOT have the COVID-19 virus and whom the test correctly labels as NOT sick (Negative).
  • False Positives (FP) are the people who are truly NOT sick but, based on the test, were falsely labeled as sick (Positive).
  • False Negatives (FN) are the people who are truly sick but, based on the test, were falsely labeled as NOT sick (Negative).

For the perfect case, we would want high values of TP and TN and zero FP and FN; this would be the…
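As a minimal sketch (not from the original article, using scikit-learn and made-up labels), the four counts can be read off a confusion matrix:

from sklearn.metrics import confusion_matrix

# 1 = sick, 0 = not sick; hypothetical ground truth and test outcomes
y_true = [1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(tp, tn, fp, fn)  # 3 3 1 1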


Part 1: Hold-out Method

  1. How to separate data?

Now we know that our model has errors, and there could be several sources of error. But how do we identify which one? We have millions of records in the training set and at least several thousand in the dev set. The test set is not in the picture yet.

We cannot evaluate every record in the training set, nor can we evaluate each record in the dev set. In order to identify the kinds of errors our model makes, we split the dev set into two parts: the eyeball set and the blackbox set.
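A minimal sketch of such a split (the sizes and variable names are illustrative, not prescribed by the method):

import random

random.seed(0)
dev_set = list(range(10_000))     # placeholder for the dev-set records
random.shuffle(dev_set)

eyeball_set = dev_set[:1_000]     # small part we inspect manually, error by error
blackbox_set = dev_set[1_000:]    # larger part used only for aggregate metrics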


Table of Contents

· Part 1: Dataset from torch.utils.data
· Part 2: Dataset from IterableDataset
· Part 3: notes
· Part 4: reference

Part 1: Dataset from torch.utils.data

Before PyTorch 1.2 the only available dataset class was the original “map-style” dataset. This simply requires the user to inherit from the torch.utils.data.Dataset class and implement the __len__ and __getitem__ methods, where __getitem__ receives an index which is mapped to some item in your dataset.

This is from the How to Build a Streaming DataLoader with PyTorch blog post, and it summarizes the map-style Dataset class well.
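For reference, a minimal map-style dataset might look like this (a sketch, not code from that blog post):

from torch.utils.data import Dataset, DataLoader

class SquaresDataset(Dataset):
    """Map-style dataset: index i maps to the pair (i, i**2)."""
    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n

    def __getitem__(self, idx):
        return idx, idx ** 2

loader = DataLoader(SquaresDataset(8), batch_size=4)
for xs, ys in loader:
    print(xs, ys)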

Part 2: Dataset from IterableDataset

IterableDataset is particularly suitable for streaming data, such as a large file where it is difficult to read everything…
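A minimal iterable-style counterpart (again a sketch; "data.txt" is a placeholder path) yields items one at a time instead of indexing them:

from torch.utils.data import IterableDataset, DataLoader

class LineStream(IterableDataset):
    """Iterable-style dataset: stream lines from a text file without loading it all."""
    def __init__(self, path):
        self.path = path

    def __iter__(self):
        with open(self.path) as f:
            for line in f:
                yield line.strip()

loader = DataLoader(LineStream("data.txt"), batch_size=32)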


Table of Contents

· Part 1: What is Tensor?
· Part 2: Tensor generator
· Part 3: Tensor property
· 3.1 Modify itself
· 3.2 Internal property

Part 1: What is Tensor?

A PyTorch tensor is nearly the same thing as a NumPy array, but with an additional restriction that unlocks some extra capabilities. It is the same in that it, too, is a multidimensional table of data, with all items of the same type. The restriction is that a tensor has to use a single basic numeric type for all components. As a result, a tensor is not as flexible as a genuine array of arrays, which…
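A quick illustration (a sketch, not from the original text):

import torch
import numpy as np

t = torch.tensor([[1, 2, 3], [4, 5, 6]])  # 2x3 tensor, one numeric dtype for all items
print(t.dtype, t.shape)                   # torch.int64 torch.Size([2, 3])

a = np.array([[1, 2, 3], [4, 5, 6]])
print(torch.from_numpy(a))                # a NumPy array converts directly to a tensor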


Part 1: Introduction to Dash

Every Dash app requires a layout. The layout includes:

  • dash core components dcc.Graph
  • dash html components html.Div, html.H3
app.layout = html.Div(
    children=[
        html.Div(
            className="data-class",
            children=[
                html.H3("head"),
                dcc.Graph(id="my-id"),
            ],
        ),
    ]
)
  • html.Div is more like a wrapper, and it can contain other components like dcc.Graph etc.
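Putting the layout pieces together, a minimal runnable app might look like this (a sketch with placeholder data, assuming Dash 2.x imports; not code from the original post):

import dash
from dash import dcc, html
import plotly.graph_objects as go

app = dash.Dash(__name__)

app.layout = html.Div(
    className="data-class",
    children=[
        html.H3("head"),
        dcc.Graph(id="my-id", figure=go.Figure(data=[go.Bar(x=["a", "b"], y=[2, 4])])),
    ],
)

if __name__ == "__main__":
    app.run_server(debug=True)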

Part 2: Notes

(1) If the output is an html.Div, then its target property is children

html.Div(id="my-div")

@app.callback(Output("my-div", "children"), Input("my-input", "value"))  # an Input is required; "my-input" is illustrative
def update_div(value):
    return html.Div([dcc.Graph(figure=fig)])  # fig built elsewhere from the input value

(2) If the output is a dcc.Graph, then its target property is figure

dcc.Graph(id="my-graph")

@app.callback(Output("my-graph", "figure"), Input("my-input", "value"))
def update_graph(value):
    fig = {
        "data": [go.Bar()],
        "layout": {},
    }
    return fig

or

@app.callback(Output("my-graph", "figure"), Input("my-input", "value"))
def update_graph(value):
    from plotly.subplots import make_subplots
    fig = make_subplots(rows=2, cols=1, shared_xaxes=True)…

Part 1: Introduction

  • PyTorch example code

where you will find the following functions that define the hyper-parameter search space:

  1. trial.suggest_int("n_layers", 1, 3)
  2. trial.suggest_categorical("optimizer", ["Adam", "RMSprop"])
  3. trial.suggest_float("lr", 1e-5, 1e-1, log=True)
  • In Optuna there are four key terminologies:
  1. objective: objective function that you want to optimize
  2. trial: a single call of the objective function
  3. study: an optimization session, which is a set of trials
  4. parameters: a variable whose value is to be optimized
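A minimal sketch tying these terms together (illustrative, not the PyTorch example itself):

import optuna

def objective(trial):
    # each call of this function is one trial; "x" is the parameter being optimized
    x = trial.suggest_float("x", -10.0, 10.0)
    return (x - 2) ** 2

study = optuna.create_study(direction="minimize")  # a study is a set of trials
study.optimize(objective, n_trials=20)
print(study.best_params)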



Part 1: Concepts

  • Experiment
import mlflow

# Create an experiment name, which must be unique and case sensitive
experiment_id = mlflow.create_experiment("Social NLP Experiments")
experiment = mlflow.get_experiment(experiment_id)
print("Name: {}".format(experiment.name))
print("Experiment_id: {}".format(experiment.experiment_id))
print("Artifact Location: {}".format(experiment.artifact_location))
print("Tags: {}".format(experiment.tags))
print("Lifecycle_stage: {}".format(experiment.lifecycle_stage))
  • Runs
from mlflow.tracking import MlflowClient
from sklearn.neighbors import KNeighborsClassifier

def print_auto_logged_info(r):
    tags = {k: v for k, v in r.data.tags.items() if not k.startswith("mlflow.")}
    artifacts = [f.path for f in MlflowClient().list_artifacts(r.info.run_id, "model")]
    print("run_id: {}".format(r.info.run_id))
    print("artifacts: {}".format(artifacts))
    print("params: {}".format(r.data.params))
    print("metrics: {}".format(r.data.metrics))
    print("tags: {}".format(tags))

mlflow.autolog()
with mlflow.start_run() as run:
    # X, y are assumed to be an existing feature matrix and label vector
    neigh = KNeighborsClassifier(n_neighbors=5)
    neigh.fit(X, y)

print('active run_id: {}'.format(run.info.run_id))

print_auto_logged_info(mlflow.get_run(run_id=run.info.run_id))…

Part 1: What is Captum?

Captum is a model interpretability library for PyTorch which currently offers a number of attribution algorithms that allow us to understand the importance of input features, hidden neurons, and layers.
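As a brief illustration (a sketch with a toy model and random input, not from the original post), attribution with Integrated Gradients looks roughly like this:

import torch
import torch.nn as nn
from captum.attr import IntegratedGradients

# toy two-class model and input, purely for illustration
model = nn.Sequential(nn.Linear(3, 8), nn.ReLU(), nn.Linear(8, 2))
model.eval()

inputs = torch.rand(1, 3)
ig = IntegratedGradients(model)
attributions, delta = ig.attribute(inputs, target=0, return_convergence_delta=True)
print(attributions)  # contribution of each of the 3 input features to the class-0 score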
