To count the number of uppercase letters in a string (gsub returns the number of substitutions it made):
echo 'ERica' | gawk '{print gsub("[A-Z]", "",$0)}'
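If gawk is not available, the same count can be sketched with tr and wc. This is a different technique from the gsub one above: it deletes every character that is not A-Z and counts what remains. The function name count_upper is made up for illustration.

```shell
# Count uppercase ASCII letters by deleting everything else and counting the rest.
count_upper() {
    # tr -cd keeps only A-Z; wc -c counts the bytes left;
    # the final tr strips the padding some wc implementations add.
    printf '%s' "$1" | tr -cd 'A-Z' | wc -c | tr -d ' '
}

count_upper 'ERica'   # prints 2
```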
Monday, 28 March 2016
Replacement Text Case Conversion in Regular Expression
Replacement Text Case Conversion
For example, to convert the text matched by the second capture group to uppercase (\U uppercases everything that follows it in the replacement):
nd=$(dirname "$f" | perl -pe 's|(.+/)([^/]+)/?$|$1\U$2|g')
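For comparison, here is a rough sketch of the same idea without a regular expression, using dirname, basename and tr. It assumes the goal is to uppercase only the last path component. The function name upper_last_component and the sample path are hypothetical.

```shell
# Uppercase only the last component of a path, leaving the parent untouched.
upper_last_component() {
    path=${1%/}                                   # drop one trailing slash, if any
    parent=$(dirname "$path")                     # everything before the last component
    last=$(basename "$path" | tr '[:lower:]' '[:upper:]')
    printf '%s/%s\n' "$parent" "$last"
}

upper_last_component /data/projects/sample_a      # prints /data/projects/SAMPLE_A
```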
Wget (The Non-interactive Network Downloader) Options
--content-disposition
If this is set to on, experimental (not fully-functional) support for "Content-Disposition" headers is enabled. This can currently result in extra round-trips to the server for a "HEAD" request, and is known to suffer from a few bugs, which is why it is not currently enabled by default.
This option is useful for some file-downloading CGI programs that use "Content-Disposition" headers to describe what the name of a downloaded file should be.
--no-check-certificate
Don't check the server certificate against the available certificate authorities. Also don't require the URL host name to match the common name presented by the certificate.
As of Wget 1.10, the default is to verify the server's certificate against the recognized certificate authorities, breaking the SSL handshake and aborting the download if the verification fails. Although this provides more secure downloads, it does break interoperability with some sites that worked with previous Wget versions, particularly those using self-signed, expired, or otherwise invalid certificates. This option forces an "insecure" mode of operation that turns the certificate verification errors into warnings and allows you to proceed.
If you encounter "certificate verification" errors or ones saying that "common name doesn't match requested host name", you can use this option to bypass the verification and proceed with the download. Only use this option if you are otherwise convinced of the site's authenticity, or if you really don't care about the validity of its certificate. It is almost always a bad idea not to check the certificates when transmitting confidential or important data.
Friday, 25 March 2016
R ggplot2 vjust and hjust
What do hjust and vjust do when making a plot using ggplot?
Imagine that the text is bordered within a box.
hjust=0 places the reference position on the left side of the box. hjust=n (n>0) shifts the box to the left of the reference position by n*(box width), so hjust=0.5 centres the text and hjust=1 right-aligns it. hjust=n (n<0) shifts the box to the right of the reference position by |n|*(box width).
vjust=0 places the reference position on the bottom side of the box. vjust=n (n>0) shifts the box down from the reference position by n*(box height). vjust=n (n<0) shifts the box up from the reference position by |n|*(box height).
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Cited from the book "ggplot2 Elegant Graphics for Data Analysis"
Justification of a string (or legend) defines the location within the string that is placed at the given position. There are two values for horizontal and vertical justification. The values can be:
- A string: "left", "right", "centre", "center", "bottom", and "top".
- A number between 0 and 1, giving the position within the string (from bottom-left corner).
Thursday, 24 March 2016
Deconvolute R Package UpSetR
Functions located in Helper.funcs.R:
## Finds the columns that represent the sets
FindStartEnd
## Finds the n largest sets if the user hasn't specified any sets
FindMostFreq
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Functions located in MainBar.R:
Counter
Make_main_bar
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Functions located in Matrix.R
Create_matrix
Create_layout
MakeShading
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Functions located in SizeBar.R
FindSetFreqs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Functions located in General.query.funcs.R
General.query.funcs.R
Wednesday, 23 March 2016
Methylation Sequencing Papers
Whole-Genome Bisulfite Sequencing of Two Distinct Interconvertible DNA Methylomes of Mouse Embryonic Stem Cells
Single-cell genome-wide bisulfite sequencing for assessing epigenetic heterogeneity
An epigenomic roadmap to induced pluripotency reveals DNA methylation as a reprogramming modulator
Active DNA demethylation at enhancers during the vertebrate phylotypic period
Monday, 21 March 2016
Perl: the Input Record Separator
Cited from slurp mode - reading a file in one step
The $/ variable is the Input Record Separator in Perl. When we put the read-line operator in scalar context, for example by assigning to a scalar variable $x = <$fh>, perl will read from the file up-to and including the Input Record Separator which is, by default, the new-line \n.
Assigning undef to $/ (for example with local $/;) makes the read-line operator read until the first time it encounters undef in the file. That never happens, so it reads to the end of the file. This is called slurp mode, because of the sound the file makes when we read it.
Perl: The Difference Between My and Local Variables
Cited from The difference between my and local
'local' temporarily changes the value of the variable, but only within the scope it exists in.
'my' creates a variable that does not appear in the symbol table, and does not exist outside of the scope that it appears in.
$::a refers to $a in the 'global' namespace.
use local when:
- you want to amend a special Perl variable, e.g. $/ when reading in a file (my $/; throws a compile-time error)
Perl Repetition Operator "x"
Cited from How can I repeat a string N times in Perl?
Binary "x" is the repetition operator. In scalar context or if the left operand is not enclosed in parentheses, it returns a string consisting of the left operand repeated the number of times specified by the right operand. In list context, if the left operand is enclosed in parentheses or is a list formed by "qw/STRING/", it repeats the list. If the right operand is zero or negative, it returns an empty string or an empty list, depending on the context.
say '-' x 80; # print row of dashes
my @ones = (1) x 80; # a list of 80 1's
@ones = (5) x @ones; # set all elements to 5
Perl qw() Function
Cited from Using the Perl qw() function
Any non-alphanumeric, non-whitespace delimiter can be used to surround the qw() string argument.
The following are equivalent:
@names = qw(Kernighan Ritchie Pike);
@names = qw/Kernighan Ritchie Pike/;
@names = qw'Kernighan Ritchie Pike';
@names = qw{Kernighan Ritchie Pike};
No interpolation is possible in the string you pass to qw().
Wednesday, 16 March 2016
Signal Artifact Blacklist Regions
A comprehensive collection of signal artifact blacklist regions
Tools to remove reads in the blacklist from BAM files:
bamutils filter
Monday, 14 March 2016
Bash Run a Given Function in Parallel
How to run given function in Bash in parallel?
GNU Parallel and Bash functions: How to run the simple example from the manual
Bash script processing commands in parallel
parallel -j $NSLOTS -q --pipe <commands>
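When GNU Parallel is not installed, a function can also be run in parallel with plain shell job control. The sketch below uses & and wait, not parallel, so it has no job-slot limit like -j. The function square and the directory variable outdir are hypothetical names.

```shell
# Each job writes its result to its own file so outputs cannot interleave.
outdir=$(mktemp -d)

# Hypothetical worker: stores the square of its argument.
square() {
    echo $(( $1 * $1 )) > "$outdir/$1.out"
}

for n in 1 2 3 4; do
    square "$n" &      # run each call in a background subshell
done
wait                   # block until every background job has finished

cat "$outdir"/1.out "$outdir"/2.out "$outdir"/3.out "$outdir"/4.out
```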
Thursday, 10 March 2016
Git Commands
Cited from Ry’s Git Tutorial
git --version
To turn a directory into a Git repository:
cd [dirname]; git init
A .git directory stores all the tracking data for our repository.
An untracked file is one that is not under version control.
You should only track source files and omit anything that can be generated from those files.
The git add command tells Git to add the file to the repository.
A snapshot represents the state of your project at a given point in time.
Git’s term for creating a snapshot is called staging.
The git status command will only show us uncommitted changes. To view our project history, use git log.
To tell Git who we are,
git config --global user.name "Your Name"
git config --global user.email your.email@example.com
The --global flag tells Git to use this configuration as a default for all of your repositories. Omitting it lets you specify different user information for individual repositories.
Another useful option is to pass a filename to git log (git log filename) to display file-specific history.
git checkout <commit-id>
View a previous commit.
Tags are convenient references to official releases and other significant milestones in a software project. They let developers easily browse and check out important revisions. For example, we can now use the v1.0 tag to refer to the third commit instead of its random ID. To view a list of existing tags, execute git tag without any arguments.
git tag -a v1.0 -m "message"
Never make changes directly to a previous revision.
When using git revert, remember to specify the commit that you want to undo—not the stable commit that you want to return to. It helps to think of this command as saying “undo this commit” rather than “restore this version.”
In Git, a branch is an independent line of development.
The HEAD is Git’s internal way of indicating the snapshot that is currently checked out.
To create a new branch,
git branch branch-name
To check out a branch,
git checkout branch-name
When the history of two branches diverges, a dedicated commit is required to combine the branches. This situation may also give rise to a merge conflict, which must be manually resolved before anything can be committed to the repository.
Conflicts occur when we try to merge branches that have edited the same content.
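The commands above can be tried end to end in a throwaway repository. This is only a sketch: the temporary directory, the file notes.txt, the branch name experiment and the commit messages are all made up, and the identity is set per repository (without --global) so nothing outside the test directory changes.

```shell
# Throwaway repository in a temporary directory.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q

# Per-repository identity: omitting --global keeps it local to this repo.
git config user.name "Your Name"
git config user.email your.email@example.com

echo 'hello' > notes.txt
git add notes.txt                     # stage the file
git commit -q -m "Add notes"          # record the snapshot
git tag -a v1.0 -m "First release"    # annotated tag on that commit

git branch experiment                 # create a branch
git checkout -q experiment            # switch to it
echo 'more' >> notes.txt
git commit -q -am "Extend notes"

git log --oneline -- notes.txt        # file-specific history
git revert --no-edit HEAD             # undo the latest commit with a new commit
```

Note that git revert is given the commit to undo (HEAD here), not the commit to return to.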
###################################################################
Monday, 7 March 2016
Sunday, 6 March 2016
Saturday, 5 March 2016
Awk/Gawk Version
Cited from How can I find my awk version
gawk -Wversion 2>/dev/null || gawk --version
Pass Arguments to a Bash Script
How to pass arguments to a Bash-script
How to handle multiple input file arguments using getopts
Command Line Options: How To Parse In Bash Using “getopt”
Bash getopt versus getopts
How to use getopt in bash command line with only long options?
Small getopts tutorial
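As a quick reminder of what the getopts approach from these links looks like, here is a minimal sketch. The option letters (-v and a repeatable -i), the function name parse_args and the variable names are all made up for illustration. Long options are not handled, since getopts only supports short ones.

```shell
# Parse a flag (-v) and a repeatable option with an argument (-i FILE).
parse_args() {
    OPTIND=1                 # reset so the function can be called more than once
    verbose=0
    inputs=""
    while getopts "vi:" opt; do
        case "$opt" in
            v) verbose=1 ;;
            i) inputs="${inputs:+$inputs }$OPTARG" ;;   # collect every -i value
            *) echo "usage: parse_args [-v] [-i file]... [args]" >&2; return 1 ;;
        esac
    done
    shift $((OPTIND - 1))    # drop the parsed options
    rest="$*"                # whatever remains are positional arguments
}

parse_args -v -i a.txt -i b.txt extra
echo "verbose=$verbose inputs=$inputs rest=$rest"
```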
Tuesday, 1 March 2016