Category: Blog

  • IndexedDB-Manager

    DatabaseManager

    A simple class to manage IndexedDB operations.

    Features

    • Initialize the IndexedDB.
    • Set, get, delete, and list key-value pairs.
    • Bulk operations for setting multiple entries.
    • Retrieve all entries or clear the store.

    Example Usage

    1. Basic Initialization

    NOTE: If using this in an HTML file, simply include a <script src=""> tag with the appropriate location of the file.

    const DatabaseManager = require('./DatabaseManager'); // when used from Node or a bundler
    
    const dbManager = new DatabaseManager('myDatabase', 1, 'myStore', 'id');
    
    dbManager.init().then(() => {
        console.log('Database initialized and ready to use');
    }).catch(err => {
        console.error('Error initializing the database:', err);
    });

    2. Set a value in the store

    dbManager.set('username', 'johnDoe').then(() => {
        console.log('Value set successfully');
    
        // Get the value from the store
        return dbManager.get('username');
    }).then(value => {
        console.log('Retrieved value:', value);
    }).catch(err => {
        console.error('Error setting or getting the value:', err);
    });

    3. Delete a value in the store

    dbManager.delete('username').then(() => {
        console.log('Value deleted successfully');
    }).catch(err => {
        console.error('Error deleting the value:', err);
    });

    4. List all keys in the store

    dbManager.list().then(keys => {
        console.log('Keys in the store:', keys);
    }).catch(err => {
        console.error('Error listing keys:', err);
    });

    5. Retrieve all entries from the store

    dbManager.getAll().then(entries => {
        console.log('All entries:', entries);
    }).catch(err => {
        console.error('Error retrieving all entries:', err);
    });

    6. Clear all entries from the store

    dbManager.clear().then(() => {
        console.log('Store cleared successfully');
    }).catch(err => {
        console.error('Error clearing the store:', err);
    });

    7. Bulk Setting values in the store

    dbManager.setAll([
        { key: 'name', value: 'Alice' },
        { key: 'age', value: 30 },
        { key: 'job', value: 'Developer' }
    ]).then(() => {
        console.log('All values set successfully');
    }).catch(err => {
        console.error('Error setting multiple values:', err);
    });

    Visit original content creator repository
    https://github.com/EZFoxOne/IndexedDB-Manager

  • cs224n-gpu-that-talks

    Attention, I’m Trying to Speak: End-to-end speech synthesis (CS224n ’18)

    Implementation of a convolutional seq2seq-based text-to-speech model based on Tachibana et al. (2017). Given a sequence of characters, the model predicts a sequence of spectrogram frames in two stages (Text2Mel and SSRN).

    As discussed in the report, we can get fairly decent audio quality with Text2Mel trained for 60k steps, SSRN for 100k steps. This corresponds to about (6+12) hours of training on a single Tesla K80 GPU on the LJ Speech Dataset.

    Pretrained Model: [download] Samples: [base-model-M4] [unsupervised-decoder-M1]

    For more details see: Poster Paper

    Model Schematic (left), Character Embeddings (right)

    Usage:

    Directory Structure

     - runs (contains checkpoints and a params.json file for each run. params.json specifies various hyperparameters: see the params-examples folder)
        - run1/params.json ...
     - src (implementation code package)
     - sentences (contains test sentences in .txt files)
     
    train.py
    evaluate.py
    synthesize.py
    
    ../data (directory containing data in format below)
     - FOLDER
        - train.csv, val.csv (files containing [wav_file_name|transcript|normalized_transcript] as in the LJ Speech dataset)
        - wavs (folder containing corresponding .wav audio files)
    

    Script files

    Run each file with python <script_file>.py -h to see usage details.

    python train.py <PATH_PARAMS.JSON> <MODE>
    python evaluate.py <PATH_PARAMS.JSON> <MODE> 
    python synthesize.py <TEXT2MEL_PARAMS> <SSRN_PARAMS> <SENTENCES.txt> (<N_ITER> <SAMPLE_DIR>)
    

    Notebooks:

    • Evaluation: Runs model predictions across the entire training and validation sets for different saved model checkpoints and saves the final results.
    • Demo: Interactively type input sentences and listen to the generated output audio.

    Further:

    • Training on different languages with smaller amounts of data available (e.g. a dataset of Indian languages)
    • Exploring use of semi-supervised methods to accelerate training, using a pre-trained ‘audio-language model’ as initialization

    Referenced External Code:

    (From src/__init__.py) Utility code has been referenced from the following sources; all other code is the author’s own:

    Visit original content creator repository https://github.com/akashmjn/cs224n-gpu-that-talks
  • taro-plugin-vue

    taro-plugin-vue

    A customized @vitejs/plugin-vue for building component libs for Taro.


    Background

    While developing taro-ui-vue3, I already ran into several frequently recurring pitfalls of third-party Taro Vue components. One of them: when using a third-party component in a Taro project (h5 or mini-program), a component fails to resolve, e.g. [Vue-warn]: Failed to resolve component: swiper

    This problem is usually caused by the compilation configuration. Third-party Taro Vue component libraries (based on SFC templates) should be compiled with the same vue-loader configuration used by @tarojs/mini-runner and @tarojs/webpack-runner.

    To save the Taro Vue third-party component ecosystem from stepping into the same pitfalls again, the compilation configuration used by the taro-ui-vue3 feat/sfc branch has been extracted here for Taro Vue component library developers to use.

    taro-plugin-vue is in fact a vite plugin based on @vitejs/plugin-vue, configured for compiling the SFC templates of third-party Taro Vue components. It only applies to projects built and bundled with vite.

    If you are familiar with Taro's vue-loader compilation configuration, you can also pass the relevant options directly to the @vitejs/plugin-vue plugin, with no need for taro-plugin-vue at all.

    Installation

    yarn add -D taro-plugin-vue @vitejs/plugin-vue

    Usage

    taro-plugin-vue adds an h5?: boolean option on top of the Options of @vitejs/plugin-vue, used to select the target platform for compilation. All other options and usage are the same as @vitejs/plugin-vue.

    // vite.config.js
    const { vuePlugin } = require('taro-plugin-vue')
    
    export default {
      plugins: [
        // compile for mini-program platforms
        vuePlugin(),

        // or: compile for the h5 platform
        vuePlugin({ h5: true }),

        // or: configure the compiler options yourself (using
        // @vitejs/plugin-vue directly), overriding the default configuration
        vue({
          template: {
            transformAssetUrls: {
              video: ['src', 'poster'],
              'live-player': ['src'],
              // ...
            },
            compilerOptions: {
              isNativeTag: ...,
              nodeTransforms: [...]
            }
          }
        })
      ],
      //...
    }

    Default compilation configuration

    • h5

      const options: Options = {
        template: {
          ssr: false,
          transformAssetUrls: transformH5AssetUrls,
          compilerOptions: {
            mode: "module",
            optimizeImports: true,
            nodeTransforms: [transformH5Tags()] // see src/transforms.ts for details
          }
        }
      }
    • Mini-program

      // mini-apps
      const options: Options = {
        template: {
          ssr: false,
          transformAssetUrls: transformMiniappAssetUrls,
          compilerOptions: {
            mode: "module",
            optimizeImports: true,
            isNativeTag: isMiniappNativeTag // see src/transforms.ts for details
          }
        }
      }

    Other usage

    This repo also exports a few transform functions dedicated to compiling Taro Vue 3.0 SFC templates, detailed below.

    These functions can be used in an @vitejs/plugin-vue plugin configuration, as well as in a vue-loader configuration.

    /**
     * Transform mini-app asset urls.
     * @see https://github.com/NervJS/taro/blob/next/packages/taro-mini-runner/src/webpack/vue3.ts#L43-L50
     */
    export declare const transformMiniappAssetUrls
    
    /**
     * Transform H5 asset urls.
     * @see https://github.com/NervJS/taro/blob/next/packages/taro-webpack-runner/src/config/vue3.ts#L49-L62
     */
    export declare const transformH5AssetUrls
    
    /**
     * Declare native mini-app tags, so that miniapp native components
     * such as `picker`, `swiper`, `scroll-view`, etc.,
     * will be treated as native tags rather than resolved as components.
     */
    export declare function isMiniappNativeTag(tag: string): boolean;
    
    /**
     * Transform tags for h5 components.
     * For example, tag `view` will be transformed to `taro-view`,
     * so that it will be compiled to `resolveComponent('taro-view')`.
     */
    export declare function transformH5Tags(): NodeTransform;
    
    /**
     * Transform `taro-env` or `taroEnv` prop,
     * and remove node that is not for the specified platform
     * @param platform `'mini' | 'h5'`
     */
    export declare function transformEnv(platform?: 'mini' | 'h5'): NodeTransform;
    
    /**
     * Transform `onClick` to `onTap` on native tags.
     */
    export declare const transformClick: NodeTransform;

    Visit original content creator repository
    https://github.com/b2nil/taro-plugin-vue

  • juno

    logo
    Jupyter Notebook that stays in your macOS menu bar.
    demo gif


    Download

    See releases.

    Supported platforms

    • macOS

    Requirements

    • Jupyter Notebook

    How to install Jupyter Notebook

    brew install python3
    pip3 install jupyter
    jupyter notebook
    

    Config

    Juno's config is located at ~/.junorc.json.

    The default parameters are (comments below are explanatory only; actual JSON does not allow comments):

    {
      "jupyterCommand": "/usr/local/bin/jupyter-notebook", // executable path for Jupyter Notebook
      "jupyterPort": 8888, // server port
      "jupyterHome": "~", // root folder
      "openBrowserOnStartup": true, // set true if let Juno open browser after launch
      "preferLab": false // open Jupyter Lab instead of Jupyter Notebook
    }

    JupyterLab

    You can also set jupyterCommand to /usr/local/bin/jupyter-lab to use JupyterLab (you may also need to install jupyterlab via pip3 install jupyterlab).

    pyenv

    Set jupyterCommand to ~/.pyenv/shims/jupyter if you are in a pyenv-managed environment.

    Launch Juno from Terminal

    Add a juno command to open Jupyter notebooks from the Terminal by putting the following in your shell config file.

    juno() {
      open -a Juno "$1"
    }

    To open a notebook:

    juno "Untitled.ipynb"
    

    Bugs

    Feel free to report issues.

    Roadmap

    • Launch Juno in specified directory
    • Terminal integration
    • Test suite
    • Auto update

    Screenshots

    Development Installation

    npm install
    npm start
    

    Test & Build

    npm test
    npm run build
    

    License

    MIT © Yasuaki Uechi

    Visit original content creator repository https://github.com/uetchy/juno
  • england-school-admissions


    School Admissions Dashboard Logo

    England School Admissions

    An interactive dashboard to help parents in England understand the primary and secondary school admissions process.
    Explore the docs »
    View Dashboard · Project Wiki · Report Bug · Request Feature

    Table of Contents
    1. About The Project
    2. Roadmap
    3. Contributing
    4. License
    5. Contact
    6. Acknowledgments

    About The Project

    Applying for a place for your child at primary or secondary school in England can be a daunting task. In some regions, parents have a plethora of schools to choose from. Having options is no bad thing, but in order to make an informed choice parents need answers to questions such as:

    • How many preferences can I list on my application form?
    • What is the likelihood of getting an offer at my preferred school based on my circumstances?
    • Which schools are oversubscribed?
    • Which schools are the top performers in my area?

    This dashboard sets out to help parents answer those questions and more in order to better understand the landscape before submitting their application. The dashboard uses data published by UK Government on school admissions and performance at both primary and secondary level, and covers all local authorities in England.

    (back to top)

    Built With

    (back to top)

    Steps Taken

    1. Downloaded required data.
    2. Cleaned and modelled the data using R Tidyverse.
    3. Built the dashboard using Tableau Public.

    The project wiki contains more information about how this repo is structured and how the data was collected.

    (back to top)

    Roadmap

    • Show breakdown of admissions by criteria at a school level
    • Improve the aesthetics of the dashboard by using more sophisticated design elements such as custom backgrounds
    • Get feedback from parents who have used the dashboard

    See the open issues for a full list of proposed features (and known issues).

    (back to top)

    Contributing

    Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

    If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag “enhancement”. Don’t forget to give the project a star! Thanks again!

    1. Fork the Project
    2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
    3. Commit your Changes (git commit -m 'Add some AmazingFeature')
    4. Push to the Branch (git push origin feature/AmazingFeature)
    5. Open a Pull Request

    (back to top)

    License

    Distributed under the CC0 License. See LICENSE.txt for more information. All source data is licensed under the Open Government License 3.0.

    (back to top)

    Contact

    Clare Gibson – @surreydatagirl – clarelgibson@gmail.com

    Project Link: https://github.com/clarelgibson/england-school-admissions

    (back to top)

    Acknowledgments

    (back to top)

    Visit original content creator repository https://github.com/clarelgibson/england-school-admissions
  • stage_4_gm

    Table of contents

    Introduction

    Natural Language Inference (NLI) task

    The data (SNLI dataset)

    Command lines (how to use this repository)

    First of all, make sure to use the environment.

    Virtualenv – pip environment (recommended)

    The path to $VENV should be saved in your ~/.bashrc

    # Specify path to venv
    export VENV=path/to/venv
    echo $VENV
    
    # Create venv
    python -m venv $VENV/bert
    
    # Activate venv
    source $VENV/bert/bin/activate
    
    # Replicate on cpu
    pip install -r python_env/requirements.cpu.txt --no-cache-dir
    
    # Replicate on gpu
    pip install -r python_env/requirements.gpu.txt --no-cache-dir
    
    # Exit venv
    deactivate
    

    Virtualenv – conda environment

    • If you are using conda, you can use either of the two following approaches:

    conda env create -f python_env/environment.yml
    conda activate nlp
    

    conda create --name nlp --file requirements.txt
    conda activate nlp
    

    WARNING: All the environments were exported on Windows 11 (64-bit).

    Download the data

    To download the SNLI and e-SNLI data, run the following command:

    python data_download.py
    

    All the data downloaded in this step will be stored in the folder .cache\raw_data

    Pytorch lightning training script

    To run training_bert.py for quick tests, we used the following command line:

    python training_bert.py --epoch 3 --batch_size 4 --nb_data 16 --experiment bert --version 0
    
    # Or by shorthand
    python training_bert.py -e 3 -b 4 -n 16 --experiment bert --version 0
    

    The objective was only to observe the behaviour of training with a small amount of data (to spot mistakes and watch the behaviour of the loss).

    To visualize our training performance we used TensorBoard. The default logdir is .cache/logs/$EXPERIMENT, where $EXPERIMENT is the value passed to --experiment. The log directory can be changed with the --logdir flag (shorthand -s).

    tensorboard --logdir .cache/logs/$EXPERIMENT
    

    Visit original content creator repository
    https://github.com/lolofo/stage_4_gm

  • pybrainlife

    Abcdspec-compliant

    pybrainlife

    This repository contains the Python package for collecting, collating, manipulating, analyzing, and visualizing MRI data generated on brainlife.io. It is designed to be used within brainlife.io Analysis-tab Jupyter notebooks, and can also be installed from PyPI on your local machine.

    Authors

    Contributors

    Funding

    NSF-BCS-1734853 NSF-BCS-1636893

    Citations

    Please cite the following articles when publishing papers that used data, code or other resources created by the brainlife.io community.

    1. Hayashi, S., Caron, B., et al. In review

    Directory structure

    pybrainlife
    ├── dist
    │   ├── pybrainlife-1.0.0-py3-none-any.whl
    │   └── pybrainlife-1.0.0.tar.gz
    ├── poetry.lock
    ├── pybrainlife
    │   ├── data
    │   │   ├── collect.py
    │   │   └── manipulate.py
    │   ├── __init__.py
    │   └── vis
    │       ├── plots.py
    │       └── __pycache__
    │           ├── data.cpython-38.pyc
    │           └── plots.cpython-38.pyc
    ├── pyproject.toml
    ├── README.md
    └── tests
        ├── __init__.py
        └── test_pybrainlife.py
    

    Installing locally

    This package can be installed locally from PyPI using the following command:

    pip install pybrainlife
    

    Dependencies

    This package requires the following libraries.

    • python = "3.8"
    • numpy = "^1.9.3"
    • bctpy = "^0.5.2"
    • seaborn = "^0.11.2"
    • jgf = "^0.2.2"
    • scikit-learn = "^1.0.2"
    • pandas = "^1.4.2"
    • scipy = "^1.8.0"
    • requests = "^2.27.1"

    Library of Modules for Loading Data and Analyzing Data from brainlife.io

    2023 The University of Texas at Austin

    Visit original content creator repository https://github.com/brainlife/pybrainlife
  • ever-node-tools

    Visit original content creator repository
    https://github.com/Everscale-Network/ever-node-tools

  • indelope

    indelope: find indels and SVs too small for structural variant callers and too large for GATK

    indelope was started with the goal of increasing the diagnostic rate in exomes. To do this it must be:

    • fast : ~2.5 CPU-minutes per exome (25% slower than samtools view -c)
    • easy-to-use : goes from BAM to VCF in a single command.
    • novel : it does local assembly and then aligns assembled contigs to the genome to determine the event, and then does k-mer counting (not alignment) to genotype without k-mer tables.
    • accurate : because of the genotyping method, we know that called variants are not present in the reference.

    These features will help ensure that it is actually used (fast, easy-to-use) and that it finds new and valid variation.

    As of November 2017, indelope is working — it finds large indels that are clearly valid by visual inspection that are missed by GATK/freebayes/lumpy.

    As of November 2017, I am still tuning. Here is a look at the progress:

    image

    Note that while indelope is steadily improving, it still is not as good as scalpel. More improvements are coming soon.

    indelope also works on whole genomes, but, for now, that is not the target use-case.

    how it works

    indelope sweeps over a single bam and finds regions that are likely to harbor indels–reads that have more than 1 cigar event and split-reads (work on split reads is in progress). As it finds these it increments a counter for the genomic position of the event. Upon finding a gap in coverage, it goes back, finds any previous position with sufficient evidence (this is a parameter) of an event, gathers reads that have been aligned across that genomic position (and unaligned reads from that region) and does assembly on those reads. It then aligns the assembled contigs to the genome using ksw2 and uses the CIGAR to determine the event as it’s represented in the VCF. Any event will result in a novel k-mer not present in the reference genome; indelope gets the k-mer of the reference genome at the event and the novel k-mer of the alternate event. It again iterates through the reads that were originally aligned to the event and counts reference and alternate k-mers. Those counts are used for genotyping. Note that this reduces reference bias because we are aligning a contig (often >400 bases) to the genome and never re-aligning the actual reads.

    As indelope sweeps across the genome, it keeps the reads for each chunk in memory. A chunk bound is defined by a gap in coverage; this occurs frequently enough that the memory use is negligible. Once a new chunk is reached, all events from the previous chunk are called and then those reads are discarded. This method, along with the assembly method make indelope extremely fast–given 2 BAM decompression threads, it can call variants in an exome in ~ 1 minute (2.5 CPU-minutes).
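The k-mer genotyping step described above can be sketched roughly as follows. This is an illustrative Python sketch, not indelope's actual implementation; the k-mer size, the containment-based counting rule, and the allele-balance thresholds are all assumptions for the example:

```python
def kmers(seq, k):
    """All k-mers of a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def genotype(reads, ref_kmer, alt_kmer):
    """Count reads supporting the reference vs. alternate k-mer.

    ref_kmer: k-mer spanning the event site in the reference genome.
    alt_kmer: novel k-mer created by the event (taken from the
              assembled contig). Reads are never re-aligned; each
              read votes by exact k-mer containment.
    """
    k = len(ref_kmer)
    ref_count = alt_count = 0
    for read in reads:
        ks = kmers(read, k)
        if alt_kmer in ks:
            alt_count += 1
        elif ref_kmer in ks:
            ref_count += 1
    total = ref_count + alt_count
    if total == 0:
        return "./.", ref_count, alt_count
    # naive genotype call from allele balance (illustrative thresholds)
    ab = alt_count / total
    if ab < 0.15:
        gt = "0/0"
    elif ab > 0.85:
        gt = "1/1"
    else:
        gt = "0/1"
    return gt, ref_count, alt_count
```

Because the alternate k-mer comes from a contig aligned once to the genome, the per-read work is only set membership, which is part of why this genotyping scheme is fast.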

    assembly

    A read (contig) slides along another read (contig) to find the offset with the most matches. At each offset, if more than $n mismatches are found, the next offset is attempted. This is efficient enough that a random read to a random (non-matching) contig of length $N will incur ~ 1.25 * $N equality (char vs. char) tests.
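A toy version of this sliding-overlap search might look like the following (parameter names and return shape are hypothetical; the early exit after too many mismatches is what keeps the average cost near ~1.25 * $N comparisons on random sequences):

```python
def best_overlap(a, b, max_mismatches=2, min_overlap=5):
    """Slide sequence `b` along sequence `a` and return the offset
    with the most matching bases.

    An offset is abandoned as soon as more than `max_mismatches`
    mismatches are seen. Returns (offset, matches), or None when no
    acceptable overlap exists.
    """
    best = None
    for off in range(-(len(b) - min_overlap), len(a) - min_overlap + 1):
        mism = matches = 0
        for i in range(len(b)):
            j = off + i
            if j < 0 or j >= len(a):
                continue  # position of b hangs off the end of a
            if a[j] == b[i]:
                matches += 1
            else:
                mism += 1
                if mism > max_mismatches:
                    break  # early exit: too many mismatches at this offset
        else:
            # offset survived the mismatch cutoff; keep it if it's the best
            if best is None or matches > best[1]:
                best = (off, matches)
    return best
```

For example, two reads sharing a 9-base suffix/prefix overlap are paired at the offset that aligns that shared region, while two unrelated reads fail every offset and yield None.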

    Within each contig indelope tracks the number of reads supporting each base. Given a sufficient number of reads supporting a contig, it can account for sequencing errors with a simple voting scheme. That is: if contig a, position x has more than 7 supporting reads and contig b has fewer than 3 supporting reads (and we know that otherwise, a and b have no mismatches), we can vote to set the mismatch in b to the apparent match in a. This allows us to first create contigs allowing no mismatches within the reads and then to combine and extend contigs using this voting method.
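The voting scheme above can be sketched as follows (the thresholds 7 and 3 come from the text; the function shape and names are hypothetical):

```python
def vote_correct(a, a_support, b, b_support, strong=7, weak=3):
    """Correct likely sequencing errors in contig `b` using contig `a`.

    `a` and `b` are aligned, equal-length contig strings;
    `a_support` / `b_support` give the number of reads supporting each
    base. A mismatching base in `b` is overwritten with `a`'s base only
    when `a` has more than `strong` supporting reads at that position
    and `b` has fewer than `weak`.
    """
    out = []
    for base_a, sup_a, base_b, sup_b in zip(a, a_support, b, b_support):
        if base_a != base_b and sup_a > strong and sup_b < weak:
            out.append(base_a)  # confident vote: take a's base
        else:
            out.append(base_b)  # keep b's base
    return "".join(out)
```

Applying this after building mismatch-free contigs lets contigs be merged and extended despite isolated sequencing errors.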

    installation and usage

    get a binary from here and make sure that libhts.so is in your LD_LIBRARY_PATH

    then run ./indelope -h for usage. recommended is:

    indelope --min-event-len 5 --min-reads 5 $fasta $bam > $vcf
    

    to do

    • somatic mode / filter mode. allow filtering on a set of k-mers from a parental genome (parent for mendelian context or normal for somatic variation).

    • use SA tag. (and possibly discordant reads)

    see also

    • svaba does local assembly, but then genotypes by alignment to those assemblies. It is slower than indelope but it is an extremely useful tool and has a series of careful and insightful analyses in its paper. (highly recommend!!)

    • rufus does k-mer based variant detection; Andrew described to me the RUFUS assembly method that inspired the one now used in indelope.

    • lancet, scalpel, mate-clever, and prosic2 are all great tools that are similar in spirit that are worth checking out (of those, AFAICT, only scalpel has a focus on germ-line variation).

    notes and TODO

    need a better way to combine contigs

    sometimes, can have 2 contigs, each of length ~ 80 and they overlap for 60 bases but cutoff is e.g. 65. Need a way to recover this as it happens a lot in low-coverage scenarios. maybe it can first combine, then trim (currently, it’s trim, combine). This should also allow more permissive overlaps if the correction list is empty.

    track a read/contig matches multiple contigs with the same match, mismatch count

    CHM1/13 truth-set

    https://www.ncbi.nlm.nih.gov/biosample?Db=biosample&DbFrom=bioproject&Cmd=Link&LinkName=bioproject_biosample&LinkReadableName=BioSample&ordinalpos=1&IdsFromResult=316945

    ~/.aspera/connect/bin/ascp -P33001 -QT -L- -l 1000M -i ~/.aspera/connect/etc/asperaweb_id_dsa.openssh era-fasp@fasp.sra.ebi.ac.uk:/vol1/ERA596/ERA596286/bam/CHM1_1.bam .
    ~/.aspera/connect/bin/ascp -P33001 -QT -L- -l 1000M -i ~/.aspera/connect/etc/asperaweb_id_dsa.openssh era-fasp@fasp.sra.ebi.ac.uk:/vol1/ERA596/ERA596286/bam/CHM13_1.bam .

    ftp://ftp.ncbi.nlm.nih.gov/pub/dbVar/data/Homo_sapiens/by_study/nstd137_Huddleston_et_al_2016/genotype/CHM1_final_genotypes.annotated.vcf.gz
    ftp://ftp.ncbi.nlm.nih.gov/pub/dbVar/data/Homo_sapiens/by_study/nstd137_Huddleston_et_al_2016/genotype/CHM13_final_genotypes.annotated.vcf.gz
    ftp://ftp.ncbi.nlm.nih.gov/pub/dbVar/data/Homo_sapiens/by_study/nstd137_Huddleston_et_al_2016/genotype/

    contigs

    min_overlap in contig::best_match should be a float between 0 and 1 that will make sure that at least that portion of the shortest contig overlaps the other.

    Visit original content creator repository https://github.com/brentp/indelope
  • RosettaStone

    Visit original content creator repository
    https://github.com/JinhaJjing/RosettaStone