Thursday 27 December 2018

Groovy replaceAll Capturing Groups

Cited from Groovy Goodness: Using the replaceAll Methods from String

replaceAll(String, Closure)

Tuesday 6 November 2018

Homolog vs Paralog vs Ortholog

Cited from What is the difference between orthologs, paralogs and homologs?

Homology is the blanket term, both ortho- and paralogs are homologs. So, when in doubt use "homologs". However:
  • Orthologs are homologous genes that are the result of a speciation event.
  • Paralogs are homologous genes that are the result of a duplication event.

Wednesday 24 October 2018

PRC1 and PRC2

Cited from the paper "A distal intergenic region controls pancreatic endocrine differentiation by acting as a transcriptional enhancer and as a polycomb response element"

PcG-dependent repression is mediated by two major complexes known as PRC1 and PRC2 that act in an interdependent manner. PRC2 contains Ezh2 and Ezh1 in its non-canonical form, both of which catalyzes the trimethylation of Lysine 27 on histone 3 (H3K27me3). This histone mark contributes to the recruitment of the PRC1 complex, which promotes H2A ubiquitylation, interference with transcriptional machinery, and chromatin compaction [48].

Index Sorting

Cited from How to Use FlowJo’s Script Editor with Index Sorted Data

Index sorting is a method that deposits individual cells from a heterogenous mixture into wells of a plate (96, 384, and so on). Sorted cells are usually selected by some criteria (such as GFP+ and CD11b+, but CD3-) and then channeled into an empty well. Cells that do not meet the specified criteria are shunted to a waste tube.

The cytometer that sorts the cells will export a file that specifies the locations of sorted cells into their respective wells by using some sort of keyword (such as Index_Sort_Positions, Tray_X, Tray_Y, and so on).

Tuesday 23 October 2018

Fluorescence-activated Nuclear Sorting (FANS)

Cited from Cell Type-Specific Gene Expression Profiling Using Fluorescence-Activated Nuclear Sorting

Fluorescence-activated cell sorting (FACS) is a powerful method for the analysis of cell type-specific transcriptome profiles, DNA or histone modifications, and chemical compounds. 

However, many tissues are recalcitrant to cell separation and are therefore not readily accessible for FACS analysis. Here, we lay out a detailed protocol for the generation of transcriptional profiles from fluorescently labeled nuclei.

Here, the nuclei of cells of interest are labeled by transgenic expression of nuclear-localized fluorescent protein constructs and are separated from surrounding unlabeled nuclei from a crude extract by a cell sorter.

Most techniques of nuclear sorting use some kind of tissue fixation. The isolated polyadenylated RNA therefore most likely is composed of newly transcribed mRNA about to be exported from the nucleus as well as cytoplasmic RNA that was cross-linked to the nuclear envelope.

Gene and Protein Names in Humans and Mice

Cited from Guidelines for Formatting Gene and Protein Names

In general, symbols for genes are italicized (e.g., IGF1), whereas symbols for proteins are not italicized (e.g., IGF1). 

Although the general rule that gene symbols are italicized and protein symbols are not italicized holds true regardless of the type of organism, there are several variations among organisms in the composition and capitalization of alphanumeric characters within the gene and protein symbols.

Humans, non-human primates, chickens, and domestic species: Gene symbols contain three to six italicized characters that are all in upper-case (e.g., AFP). Gene symbols may be a combination of letters and Arabic numerals (e.g., 1, 2, 3), but should always begin with a letter; they generally do not contain Roman numerals (e.g., I, II, III), Greek letters (e.g., α, β, γ), or punctuation. Protein symbols are identical to their corresponding gene symbols except that they are not italicized (e.g., AFP).

Mice and rats: Gene symbols are italicized, with only the first letter in upper-case (e.g., Gfap). Protein symbols are not italicized, and all letters are in upper-case (e.g., GFAP).

Fish: In contrast to the general rule, full gene names are italicized (e.g., brass). Gene symbols are also italicized, with all letters in lower-case (e.g., brs). Protein symbols are not italicized, and the first letter is upper-case (e.g., Brs).

Flies: Gene names and symbols begin with an upper-case letter if: (1) the gene is named for a protein or (2) the gene was first named for a mutant phenotype that is dominant to the wild-type phenotype (e.g., Rpp30). Gene names and symbols begin with a lower-case letter if the gene was first named for a mutant phenotype that is recessive to the wild-type phenotype (e.g., kis). Gene symbols are italicized. Symbols for proteins that were named for genes begin with an upper-case letter, but there are no accepted formatting guidelines for proteins that were not named for genes. Protein symbols are not italicized.

Worms: Gene symbols are italicized and generally composed of three to four letters, a hyphen, and an Arabic number (e.g., abu-1). Protein symbols are not italicized, and all letters are in upper-case (e.g., ABU-1).

Bacteria: Gene symbols are typically composed of three lower-case, italicized letters that serve as an abbreviation of the process or pathway in which the gene product is involved (e.g., rpo genes encode RNA polymerase). To distinguish among different alleles, the abbreviation is followed by an upper-case letter (e.g., the rpoB gene encodes the β subunit of RNA polymerase). Protein symbols are not italicized, and the first letter is upper-case (e.g., RpoB).




Friday 12 October 2018

Pre or Pro Protein/Domain

Cited from Re: What is the function of a protein's prodomain? Why are they called that?

"pre-" was adopted as the prefix that means "before N-terminal cleavage during a secretion event".

Many proteins require additional proteolytic processing to become fully functional. This also involves cleavage of a peptide from the N-terminus, but is not part of the secretory process. So the prefix "pro-" was adopted to describe the protein prior to this processing event.

"preproprotein" (the N-terminal is cleaved during secretion, then cleaved again to make the protein active).

Caspase is a good example. It is synthesised as an inactive proprotein. In this case the term "prodomain" is used, because the N-terminal region is folded into a discrete structural unit with a specific function (the definition of a domain). Usually the prodomain mediates interaction with other proteins in a complex. Proteolytic cleavage then removes the prodomain, activates the caspase and triggers a cascade in which caspases activate other caspases, leading eventually to the cleavage of key target proteins (caspase substrates) and cell death. Other proteins besides caspases also contain prodomains that mediate a particular process for the protein.

Tuesday 28 August 2018

STAR Aligner In Retion to Phred Quality Encoding

Cited from Phred64 encoding consideration in STAR

"""
STAR does not need to phred quality encoding, at the moment it does not actually use quality scores for mapping. It will simply copy the quality strings into SAM output. If you want to convert the quality scores output is SAM for some reason, you can use --outQSconversionAdd <positive or negative number>.
"""

Monday 27 August 2018

Interact with NCBI SRA using aspera ascp command line client

Interact with NCBI SRA using aspera ascp command line client

Download All Files From A Directory

wget – recursively download all files from certain directory listed by apache

Use of Ascp to Download SRA/Fastq from EBI/NCBI

使用Aspera从EBI或NCBI下载基因组数据

Explanation of SRA

NCBI SRA数据库使用详解

Downloading SRA Data Using Command Line Utilities from NCBI

Downloading SRA data using command line utilities

Downloading SRA files from NCBI

linux下用Aspera从NCBI上下载SRA格式宏基因组数据



Alignment Sam to Sequence Fastq Files

How to convert SAM to FASTQ with Unix command line tools

Check If a File Exists on a FTP Server

bash wget - check if file exists at url before downloading

wget: check if the file exists, but not download

Multi-line Comments in Bash

Multiline shell script comments - how does this work?

Wednesday 22 August 2018

Nextflow SGE Executor Memory Specification

Cited from process memory

process my_process {
    memory {16.GB * task.attempt}                                                                                                                           
    clusterOptions "-l h_vmem=${memory.toString().replaceAll(/[\sB]/,'')}"
    ...
} 
 
Cited from Configuration

perJobMemLimit Specifies Platform LSF per-job memory limit mode

Friday 17 August 2018

Customize Bash Prompt String

  1. Install sexy-bash-prompt
  2. Modifying the prompt symbol ('PROMPT_SYMBOL') in ~/.bash_prompt
    In Bash 4.2+, add the following line at the beginning.
    PROMPT_SYMBOL=$'\u21AA'
    In Bash 4.1 or earlier, add the following line at the beginning.
    PROMPT_SYMBOL=$'\xe2\x86\xaa'
  3. Then source ~/.bash_prompt
  4. UTF8 code for additional symbols may be found at UTF-8 Arrows
  5. Assigning UTF8 code to variables was demonstrated in Awesome symbols and characters in a bash prompt
  6. Hex code for UTF8 symbols can be found through echo ↪ | hexdump -C, for example.

PS1 vs PROMPT_COMMAND

Cited from What is the difference between PS1 and PROMPT_COMMAND

"""
The difference is that PS1 is the actual prompt string used, and PROMPT_COMMAND is a command that is executed just before the prompt.
"""

Thursday 2 August 2018

Nextflow Task Specific Execution Path

Cited from Nextflow: a tutorial through examples

echo "Path is \$( pwd )\n "

Specifying $baseDir
 
"make sure the escape the $ with a backslash in your script"

echo "Path is \$PWD\n "

RNA-seq Strand-specific Terminology

Cited from Trinity strand specific: RF or FR

fr-firststrand (tophat) = RF (trinity)
fr-secondstrand (tophat) = FR (trinity)

Friday 20 July 2018

Python Exception Class

Cited from the book "Python How to Program"

""
All exceptions inherit from base class Exception and are defined in module exceptions. Python automatically places all exception names in the built-in namespace, so programs do not need to import the exceptions module to use exceptions. Python defines four primary classes that inherit from Exception -- SystemExit, StopIteration, Warning and StandardError. Exception SystemExit, when raised and left uncaught, terminates program execution. If an interactive session encounters an uncaught SystemExit exception, the interactive session terminates. Python uses exception StopIteration to determine when a for loop reaches the end of its sequence. Python uses
Warning exceptions to indicate that certain elements of Python may change in the future. StandardError is the base class for all Python error exceptions (e.g., ValueError and ZeroDivisionError).
""




Python Try Statement

Cited from the book "Python How to Program"

Python uses try statements to enable exception handling. The try statement
encloses other statements that potentially cause exceptions. A try statement begins with
keyword try, followed by a colon (:), followed by a suite of code in which exceptions
may occur. The try statement may specify one or more except clauses that immediately
follow the try suite. Each except clause specifies zero or more exception class names
that represent the type(s) of exceptions that the except clause can handle. An except
clause (also called an except handler) also may specify an identifier that the program can
use to reference the exception object that was caught. The handler can use the identifier to
obtain information about the exception from the exception object. An except clause that
specifies no exception type is called an empty except clause. Such a clause catches all
exception types. After the last except clause, an optional else clause contains code that
executes if the code in the try suite raised no exceptions. If a try statement specifies no
except clauses, the statement must contain a finally clause, which always executes,
regardless of whether an exception occurs. We discuss each possible combination of
clauses over the next several sections.

It is a syntax error to write a try statement that contains except and finally clauses.
The only acceptable forms are try/except, try/except/else and try/finally.


Python Raising an Exception

Cited from the book "Python How to Program"

"""
The simplest form of raising an exception consists of the keyword raise, followed by
the name of the exception to raise. Exception names identify classes; Python exceptions are
objects of those classes. When a raise statement executes, Python creates an object of the
specified exception class. The raise statement also may specify arguments that initialize
the exception object. To do so, follow the exception class name with a comma (,) and the
argument (or a tuple of arguments). Programs can use an exception object’s attributes to dis-
cover more information about the exception that occurred. The raise statement has many
forms.

The arguments used to initialize an exception object can be referenced in an exception handler to perform an appropriate task.

An exception can be raised without passing arguments to initialize the exception object. In
this case, knowledge that an exception of this type occurred normally provides sufficient in-
formation for the handler to perform its task.

When code in a program causes an exception, or when the Python interpreter detects a
problem, the code or the interpreter raises (or throws) an exception. Some programmers
refer to the point in the program at which an exception occurs as the throw point—an
important location for debugging purposes. Exceptions are objects of classes that inherit from class Exception. If an exception occurs in a try,suite, the try suite expires (i.e., terminates immediately), and program control transfers to the first except handler (if there is one) following the try suite.

Next, the interpreter searches for the first except handler that can process the type of exception that occurred. The interpreter locates the matching except by comparing the raised exception’s type to
each except’s exception type(s) until the interpreter finds a match. A match occurs if the
types are identical or if the raised exception’s type is a derived class of the handler’s excep-
tion type. If no exceptions occur in a try suite, the interpreter ignores the exception han-
dlers for the try statement and executes the try statement’s else clause (if the statement
specifies an else clause). If no exceptions occur, or if one of the except clauses success-
fully handles the exception, program execution resumes with the next statement after the
try statement. If an exception occurs in a statement that is not in a try suite and that state-
ment is in a function, the function containing that statement terminates immediately and the
interpreter attempts to locate an enclosing try statement in a calling code—a process
called stack unwinding.

Python is said to use the termination model of exception handling, because the try
suite that raises an exception expires immediately when that exception occurs.
"""

Thursday 19 July 2018

Python __slot__ Attribute

Cited from the book "Python How to Program"

"""

When a new class defines the __slots__ attribute, objects of the class can assign values only to attributes whose names appear in the __slots__ list. If a client attempts to assign a value to an attribute whose name does not appear in __slots__, Python raises an exception.

If a new class defines attribute __slots__, but the class’s constructor does not initialize
the attributes’ values, Python assigns None to each attribute in __slots__ when an object
of the class is created.

A derived class inherits its base-class __slots__ attribute. However, if programs should
not be allowed to add attributes to objects of the derived class, the derived class must define
its own __slots__ attribute. The derived-class __slots__ contains only the allowed
derived-class attribute names, but clients still can set values for attributes specified by the
derived class’s direct and indirect bases classes.
"""

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Cited from Usage of __slot__

class Base:
    __slots__ = 'foo', 'bar'

class Right(Base):
    __slots__ = 'baz', 

class Wrong(Base):
    __slots__ = 'foo', 'bar', 'baz'        # redundant foo and bar


Python Static Methods

Cited from the book "Python How to Program"

In Python 2.2 all classes (not only classes that inherit from object) can define static methods. A static method can be called by a client of the class, even if no objects of the class exist. Typically, a static method is a utility method of a class that does not require an object of the class to execute.

A class designates a method as
static by passing the method’s name to built-in function staticmethod and binding a
name to the value returned from the function call. Static methods differ from regular
methods because, when a program calls a static method, Python does not pass the object-
reference argument to the method. Therefore, a static method does not specify self as
the first argument. This allows a static method to be called even if no objects of the class
exist.

Static methods can be called either by using the class name in which the method is
defined or by using the name of an object of that class. Function main (lines 36–58) dem-

When a method of a class does not require an object of the class to perform its task,
the programmer designates that method as static.





Tuesday 17 July 2018

Python __str__ Method

Cited from the book "Python How to Program"

For example,
hourly = HourlyWorker( "Bob", "Smith", 40.0, 10.00 ) # create an object from the class

 print hourly # invoke __str__ implicitly
 print hourly.__str__() # invoke __str__ explicitly
 print HourlyWorker.__str__( hourly ) # explicit, unbound call, and the __str__ method defined takes only one argument 'self'.




Python __dict__ Attribute

Cited from the book "Python How to Program"

Recall that an object of a class has a namespace that contains identifiers and their values. Attribute __dict__ contains this information for each object.

Python __bases__ Attribute

Cited from the book "Python How to Program"

attribute (lines 33–34). Recall from Chapter 7 that each class contains special attributes,
including __bases__, which is a tuple that contains references to each of the class’s base
classes.

Python Unbound and Bound Method Calls

Cited from the book "Python How to Program"

A bound method call is invoked by accessing the method name through an object (e.g., anObject.method()). We have seen that Python inserts the object-reference argument for bound method calls. An unbound method call is invoked by accessing the method through its class name and specifically passing an object reference.




Python issubclass and isinstance Methods

Python provides two built-in functions—issubclass and isinstance— that enable us to determine whether one class is derived from another class and whether a value is an object of a particular class or of a subclass of that class.

Friday 13 July 2018

Operator Overloading

Cited from the book "Python How to Program"

Operators are overloaded by writing a method definition as you normally would, except
that the method name corresponds to the Python special method for that operator. For
example, the method name __add__ overloads the addition operator (+). To use an operator
on an object of a class, the class must overload (i.e., define a special method for) that operator.

It is not possible to change the “arity” of an operator (i.e., the number of operands an
operator takes)—overloaded unary operators remain unary operators, and overloaded
binary operators remain binary operators. Operators + and - each have both unary and
binary versions; these unary and binary versions can be overloaded separately, using dif-
ferent method names. It is not possible to create new operators; only existing operators can
be overloaded.

The meaning of how an operator works on objects of built-in types cannot be changed
by operator overloading. The programmer cannot, for example, change the meaning of how
+ adds two integers. Operator overloading works only with objects of user-defined classes
or with a mixture of an object of a user-defined class and an object of a built-in type.

Overloading a binary mathematical operator (e.g., +, -, *) automatically overloads the
operator’s corresponding augmented assignment statement. For example, overloading an
addition operator to allow statements like

object2 = object2 + object1

implies that the += augmented assignment statement also is overloaded to allow statements
such as

object2 += object1

Although (in this case) the programmer does not have to define a method to overload the
+= assignment statement, such behavior also can be achieved by defining the method ex-
plicitly for that class.

A unary operator for a class is overloaded as a method that takes only the object reference
argument (self). When overloading a unary operator (such as ~) as a method, if object1 is an object of class Class, when the interpreter encounters the expression ~object1 the interpreter generates the call
object1.__invert__().

A binary operator or statement for a class is overloaded as a method with two arguments:
self and other. Later in this chapter, we will overload the + operator to indicate addi-
tion of two objects of class Rational. When overloading binary operator +, if y and z
are objects of class Rational, then y + z is treated as if y.__add__( z ) had been
written, invoking the __add__ method. If y is not an object of class Rational, but z is
an object of class Rational, then y + z is treated as if z.__radd__( y ) had been
written. The method is named __radd__, because the object for which the method exe-
cutes appears to the right of the operator. Usually, overloaded binary operator methods cre-
ate and return new objects of their corresponding class.

When overloading assignment statement += as a Rational method that accepts two
arguments, if y and z are objects of class Rational, then y += z is treated as if
y.__iadd__( z ) had been written, invoking the __iadd__ method. The method is
named __iadd__, because the method performs its operations “in-place” (i.e., the
method uses no extra memory to perform its behavior). Usually, this means that the method
performs any necessary calculations on the object reference argument (self), then returns
the updated reference.

What happens if we evaluate the expression y + z or the statement y += z, and only
y is an object of class Rational? In both cases, z must be coerced (i.e., converted) to an
object of class Rational, before the appropriate operator overloading method executes.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Cited from A Guide to Python's Magic Methods

some_object + other
That was "normal" addition. The reflected equivalent is the same thing, except with the operands switched around:
other + some_object

Note that the object on the left hand side of the operator (other in the example) must not define (or return NotImplemented) for its definition of the non-reflected version of an operation. For instance, in the example, some_object.__radd__ will only be called if other does not define __add__.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Cited from __radd__

Suppose you are implementing a class that you want to act like a number via operator overloading. So you implement __add__ in your class, and now expressions like myobj + 4 can work as you want and yield some result. This is because myobj + 4 is interpreted as myobj.__add__(4), and your custom method can do whatever it means to add 4 to your custom class.
However, what about an expression like 4 + myobj which is really (4).__add__(myobj)? The 4 is an instance of a Python built-in type and its __add__ method doesn't know anything about your new type, so it will return a special value NotImplemented. (The interpreter recognizes this special value coming from __add__ and raises a TypeError exception which kills your program, which is the behavior you'd actually see, rather than the special value being returned.)
It would suck for operator overloading if myobj + 4 was valid but 4 + myobj was invalid. That's arbitrary and restrictive — addition is supposed to be commutative. Enter __radd__. Python will first try (4).__add__(myobj), and if that returns NotImplemented Python will check if the right-hand operand implements __radd__, and if it does, it will call myobj.__radd__(4)rather than raising a TypeError. And now everything can proceed as usual, as your class can handle the case and implement your behavior, rather than the built-in type's __add__ which is fixed and doesn't know about your class.









Thursday 12 July 2018

Python __getattr__ Method

Cited from the book "Python How to Program"

When a client program contains the expression time1.attribute as an rvalue (i.e., the right-hand value in an operator expression), Python first looks in time1’s __dict__ attribute for the attribute name.

If the attribute name is in __dict__, Python simply returns the attribute’s value. If the attribute name is not in the object’s __dict__, Python generates the call time1.__getattr__( attribute ) where attribute is the name of the attribute that the client is attempting to access.

Python __setattr__ Method

Cited from the book "Python How to Program"

It is important that method __setattr__ uses an object’s __dict__ attribute to set an object’s attributes. If the statement inside __setattr__ method is used,

self._hour = value

method __setattr__ would execute again, with the arguments "_hour" and value, resulting in infinite recursion. Assigning a value through the object’s __dict__ attribute, however, does not invoke method __setattr__, but simply inserts the appropriate key–value pair in the object’s __dict__.



Python __str__ Method

Cited from the book "Python How to Program"

A Python class can define special method __str__, to provide an informal (i.e., human-readable) string representation of an object of the class. If a client program of the class contains the statement

print objectOfClass

Python calls the object’s __str__ method and outputs the string returned by that method.

Returning a non-string value from method __str__ is a fatal, runtime error.


Python Class Attribute

Cited from the book "Python How to Program"

Although class attributes may seem like global variables, each class attribute resides in the namespace of the class in which it is created. Class attributes should be initialized once (and only once) in the class definition. A class’s class attributes can be accessed through any object of that class. A class’s class attributes also exist even when no object of that class exists. To access a class attribute when no object of the class exists, prefix the class name, followed by a period, to the name of the attribute.



Python Destructor

Cited from the book "Python How to Program"

A destructor executes when an object is destroyed (e.g., after no more references to the object exist). A class can define a special method called __del__ that executes when the last reference to an object is deleted or goes out of scope. The method itself does not actually destroy the object -- it performs termination housekeeping before the interpreter reclaims the object’s memory, so that memory may be reused. A destructor normally specifies no parameters other than self and returns None.



Python Object Attribute Variable Names Beginning With Double Underscores

Cited from the book "Python How to Program"

When Python encounters an attribute name that begins with two underscores, the interpreter performs name mangling on the attribute, to prevent indiscriminate access to the data. Name mangling changes the name of an attribute by including information about the class to which the attribute belongs. For example, if the Time constructor contained the line self.__hour = 0. Python creates an attribute called _Time__hour, instead of an attribute called __hour.

Python Exception

Cited from the book "Python How to Program"

An exception is a Python object that indicates a special event (most often an error) has occurred.

raise ValueError, "Invalid hour value: %d" % hour

The keyword raise is followed by the name of the exception, a comma and a value that the exception
object stores as an attribute. When Python executes a raise statement, an exception is

raised; if the exception is not caught, Python prints an error message that contains the name

of the exception and the exception’s attribute value, as shown in Fig. 7.8.

Python Object Attribute Variable Names Beginning With a Single Underscore

Cited from the book "Python How to Program"

Attribute names that begin with a single underscore have no special meaning in the syntax of the Python language itself. However, the single leading underscore is a convention among Python programmers who use the class. When a class author creates an attribute with a single leading underscore, the author does not want users of the class to access the attribute directly. If a program requires access to the attributes, the class author provides some other means for doing so. In this case, we provide access methods through which clients should manipulate the data.

Python Trailing Commas

Cited from Python trailing comma after print executes next instruction

In Python 2.x, a trailing , in a print statement prevents a new line to be emitted.
  • In Python 3.x, use print("Hi", end="") to achieve the same effect.

Python Class Instance Object

Cited from the book "Python How to Program"

All methods, including constructors, must specify at least one parameter. This parameter represents the object of the class for which the method is called. This parameter often is referred to as the class instance object. This term can be confusing, so we refer to the first argument of any method as the object reference argument, or simply the object reference. Methods must use the object reference to access attributes and other methods that belong to the class. By convention, the object reference argument is called self.

Failure to specify an object reference (usually called self) as the first parameter in a method definition causes fatal logic errors when the method is invoked at runtime.

Python inserts the first (object reference) argument into every method call, including a class’s constructor call. 


Python Constructor Method of a Class

Cited from the book "Python How to Program"

The special method __init__ is the constructor method of the class. A constructor is a special method that executes each time an object of a class is created. The constructor (method __init__) initializes the attributes of the object and returns None. Python classes may define several other special methods, identified by leading and trailing double-underscores (__) in the name.

Returning a value other than None from a constructor is a fatal, runtime error.
 

Python Documentation String

Cited from the book "Python How to Program"

If a class contains a optional documentation string -- a string that describes the class, the string must appear in the line or lines following the class header. A user can view a class’s documentation string by executing the following statement.

print ClassName.__doc__

Modules, methods and functions also may specify a documentation string.

By convention, docstrings are triple-quoted strings. This convention allows the class author
to expand a program’s documentation (e.g., by adding several more lines) without having to
change the quote style.


How To Install Linux, Apache, MySQL, PHP (LAMP) stack on Ubuntu 16.04

How To Install Linux, Apache, MySQL, PHP (LAMP) stack on Ubuntu 16.04

How to Set Up a Dedicated Web Server for Free

Enable CGI on an Apache server

DIY: Enable CGI on your Apache server


Wednesday 11 July 2018

Python List DeepCopy

b_list = a_list[:] # deep copy

Python Dictionary ShallowCopy and DeepCopy

a_dict = dict.copy() # shallow copy

from copy import deepcopy
b_dict = deepcopy(dict) # deep copy

Python Methods

Cited from the book "Python How to Program"

All Python data types contain at least three properties: a value, a type and a location. Some Python data types (e.g., strings, lists and dictionaries) also contain methods. A method is a function that performs the behaviors (tasks) of an object.

To invoke the list method, specify the name of the list, followed by the dot (.) access operator, followed by the method call (i.e., method name and necessary arguments).

Function Arguments

Cited from the book "Python How to Program"

Default arguments must appear to the right of any non-default arguments in a function’s parameter list. When calling a function with two or more default arguments, if an omitted argument is not the rightmost argument in the argument list, all arguments to the right of that argument also must be omitted.

Python arguments are always passed by object reference—the function receives references to the values passed as arguments. In practice, pass-by-object-reference can be thought of as a combination of pass-by-value and pass-by-reference. If a function receives a reference to a mutable object (e.g., a dictionary or a list), the function can modify the original value of the object. It is as if the object had been passed by reference. If a function receives a reference to an immutable object (e.g., a number, a string or a tuple, whose elements are immutable values), the function cannot modify the original object directly. It is as if the object had been passed by value.






Python Namespace

Cited from the book "Python How to Program"

Namespaces store information about an identifier and the value to which it is bound. Python defines three namespaces: local, global and built-in. When a program attempts to access an identifier’s value, Python searches the namespaces in a certain order—local, global and built-in namespaces—to see whether and where the identifier exists.

If the function’s local namespace does not contain an identifier named x (e.g., the function does not define any parameters or create any identifiers named x), Python searches the next outer namespace: the global namespace (sometimes called the module namespace).

The global namespace contains the bindings for all identifiers, function names and class names defined within a module or file. Each module or file’s global namespace contains an identifier called __name__ that states the module’s name (e.g., "math" or "random"). When a Python interpreter session starts or when the Python interpreter begins executing a program stored in a file, the value of __name__ is "__main__".

The built-in namespace contains identifiers that correspond to many Python functions and error messages. For example, functions raw_input, int and range belong to the built-in namespace. Python creates the built-in namespace when the interpreter starts, and programs normally do not modify the namespace (e.g., by adding an identifier to the namespace). Identifiers contained in built-in namespaces may be accessed by code in programs, modules or functions.

Python provides a way for programmers to determine what identifiers are available from the current namespace. Built-in function dir returns a list of these identifiers.

Python provides many other introspective capabilities, including functions globals and locals that return additional information about the global and local namespaces, respectively.

>>> import math
>>> dir()
['__builtins__', '__doc__', '__name__', 'math']
>>> print math
<module 'math' (built-in)>
>>> dir( math )
['__doc__', '__name__', 'acos', 'asin', 'atan', 'atan2', 'ceil',
'cos', 'cosh','e', 'exp', 'fabs', 'floor', 'fmod', 'frexp', 'hy-
pot', 'ldexp', 'log', 'log10','modf', 'pi', 'pow', 'sin', 'sinh',
'sqrt', 'tan', 'tanh']

Python provides the from/import statement to import one or more identifiers from a module directly into the program’s namespace.

When the interpreter executes the line

from math import sqrt

the interpreter creates a reference to function math.sqrt and places the reference directly into the session’s namespace. Now, we can call the function directly without using the dot operator.

>>> from math import sqrt
>>> dir()
['__builtins__', '__doc__', '__name__', 'sqrt']
>>> sqrt( 9.0 )
3.0
>>> from math import sin, cos, tan
>>> dir()
['__builtins__', '__doc__', '__name__', 'cos', 'sin', 'sqrt',
'tan']

The statement

from math import *

imports all identifiers that do not start with an underscore from the math module into the interactive session’s namespace.

>>> import random as randomModule
>>> dir()
['__builtins__', '__doc__', '__name__', 'randomModule']
>>> randomModule.randrange( 1, 7 )
1
>>> from math import sqrt as squareRoot
>>> dir()
['__builtins__', '__doc__', '__name__', 'randomModule', 'square-
Root']
>>> squareRoot( 9.0 )
3.0






Python None

Cited from the book "Python How to Program"

None is a Python value that represents null—indicating that no value has been declared—and evaluates to false in conditional expressions.


Importing Functions from a Module

Cited from the book "Python How to Program"

To use a function that is defined in a module, a program must import the module, using keyword import. After the module has been imported, the program can invoke functions in that module, using the module’s name, a dot (.) and the function call (i.e., moduleName.functionName()).

Program Components in Python

Cited from the book "Python How to Program"

"""
Program components in Python are called functions, classes, modules and packages. Typically, Python programs are written by combining programmer-defined (programmer-created) functions and classes with functions or classes already available in existing Python modules. A module is a file that contains definitions of functions and classes. Many modules can be grouped together into a collection, called a package.
"""

Tuesday 10 July 2018

Returned Value of And Operator

Cited from the book "Python How to Program"

If a combined condition evaluates to false, the and operator returns the first value which evaluated to false. Conversely, if the combined condition evaluates to true, the and operator returns the last value in the condition.

Python Tips

Cited from the book "Python How to Program"

python -i program.py enters interactive mode after execution completion.

from __future__  import division
'/' : true division
'//': floor division

An expression containing and or or operators is evaluated until its truth or falsity is known. This is called short circuit evalutation.

Normally, control passes through the statement. If a print statement includes a function call without parentheses, it displays the memory location of the function. If the user intends to assign the result of a function call to a variable, a function call without parentheses binds the function itself to the variable.

Friday 6 July 2018

Installing Ruby on Rails in Ubuntu

Cited from Coursera Ruby on Rail Introduction

sudo apt-get install git gitk git-gui

sudo apt-get install gcc build-essential libpq-dev libssl-dev libreadline-dev libsqlite3-dev zlib1g-dev

git clone git@github.com:rbenv/rbenv.git .rbenv

echo 'export PATH="$HOME/.rbenv/bin:$PATH"' >> ~/.bashrc


Ruby on Rail Relevant Software

Cited from Coursera Ruby on Rails Introduction

Ruby: language interpreter
Rails: web application platform and dependencies
PhantomJS: headless Web testing support

Tuesday 26 June 2018

Join Variable Numbers of Paths in Groovy

Cited from Joining Paths in Java

Converts a path string, or a sequence of strings that when joined form a path string, to a Path.
...
Path representing an empty path is returned if first is the empty string and more does not contain any non-empty strings.

In Groovy, to print a variable number of arguments, the code is below.

import java.nio.file.*

String joinPath(String first, String... args){
Path path = Paths.get(first, args)
path.toString()
}

println joinPath('a')
println joinPath('a', 'b', 'c')
println joinPath('~/Data')

java.nio.file.Path Class in Groovy

Cited from Groovy Goodness: Extra Methods for NIO Path

Groovy adds a lot of extra methods to the File object to work with the contents or find and filter files in a directory. These methods are now also added to the java.nio.file.Path class since Groovy 2.3.

In a groovy script, the java.nio.file.Path class can be imported as the following. 

import java.nio.file.*

Wednesday 20 June 2018

What is a Read Group?

Cited from Read Groups

There is no formal definition of what is a read group, but in practice, this term refers to a set of reads that were generated from a single run of a sequencing instrument.

In the simple case where a single library preparation derived from a single biological sample was run on a single lane of a flowcell, all the reads from that lane run belong to the same read group. When multiplexing is involved, then each subset of reads originating from a separate library run on that lane will constitute a separate read group.

Strings in Groovy

Groovy: Ways to represent String

How to convert String to GString and replace placeholder in Groovy?

Tuesday 12 June 2018

Installing Docker From Ubuntu 18.04 Repository

Cited from How to Install Docker On Ubuntu 18.04 Bionic Beaver

sudo apt install docker.io$ 
sudo systemctl start docker
sudo systemctl enable docker
docker --version

Nextflow Tutorial

Nextflow Workshop 2017

Nextflow Tutorial ACGT14

Nextflow Resources


Mutational Signature Analyses

What does the plot of mutational signature analysis mean?

Interpreting the signature results

Why are there only 96 triplets?

"""
This classification has 96 possible mutations since there are six classes of base substitution C>A, C>G, C>T, T>A, T>C, T>G (all substitutions are referred to by the pyrimidine of the mutated Watson-Crick base pair) and the bases immediately 5’ and 3’ to each mutated base are incorporated.

Wednesday 6 June 2018

Java – Static Class, Block, Methods and Variables

Cited from Java – Static Class, Block, Methods and Variables

"""
Static keyword can be used with class, variable, method and block. Static members belong to the class instead of a specific instance, this means if you make a member static, you can access it without object. Let’s take an example to understand this:

Here we have a static method myMethod(), we can call this method without any object because when we make a member static it becomes class level. If we remove the static keyword and make it non-static then we must need to create an object of the class in order to call it.

Static block is used for initializing the static variables.This block gets executed when the class is loaded in the memory. A class can have multiple Static blocks, which will execute in the same sequence in which they have been written into the program.
  • Static variables are also known as Class Variables.
  • Unlike non-static variables, such variables can be accessed directly in static and non-static methods.
Static Methods can access class variables(static variables) without using object(instance) of the class, however non-static methods and non-static variables can only be accessed using objects.
Static methods can be accessed directly in static and non-static methods.

A class can be made static only if it is a nested class.
  1. Nested static class doesn’t need reference of Outer class
  2. A static class cannot access non-static members of the Outer class
class JavaExample{
   private static String str = "BeginnersBook";

   //Static class
   static class MyNestedClass{
 //non-static method
 public void disp() {

    /* If you make the str variable of outer class
     * non-static then you will get compilation error
     * because: a nested static class cannot access non-
     * static members of the outer class.
     */
    System.out.println(str); 
 }

   }
   public static void main(String args[])
   {
       /* To create instance of nested class we didn't need the outer
 * class instance but for a regular nested class you would need 
 * to create an instance of outer class first
        */
 JavaExample.MyNestedClass obj = new JavaExample.MyNestedClass();
 obj.disp();
   }
}
"""

Static Variable

Cited from Java – static variable with example

"""
A static variable is common to all the instances (or objects) of the class because it is a class level variable. In other words you can say that only a single copy of static variable is created and shared among all the instances of the class. Memory allocation for such variables only happens once when the class is loaded in the memory.

Static variable can be accessed directly in a static method.

Static variable initialization
  1. Static variables are initialized when class is loaded.
  2. Static variables are initialized before any object of that class is created.
  3. Static variables are initialized before any static method of the class executes.
If a variable is 'final', the value of this variable can never be changed in the current or in any class.

Final variable always needs initialization, if you don’t initialize it would throw a compilation error.
"""