Questions tagged [biopython]

Biopython is a set of freely available tools for biological computation written in Python. Please only use this tag for issues relating to the Biopython suite of tools.

0
votes
1answer
27 views

How to get the sequence counts (in fasta) with conditions using python?

I have a fasta file (fasta is a file in which header line starts with > followed by a sequence line corresponding to that header). I want to get the counts for sequences matching TRINITY and total ...
1
vote
0answers
33 views

How can I run a pubmed query in a Django app deployed on Elastic Beanstalk?

I wrote a Django app to query the PubMed database using the Entrez tool provided by the Biopython package. Everything runs smoothly locally. After deploying on AWS Elastic Beanstalk I get a "Permission ...
1
vote
0answers
48 views

SeqIO.parse Biopython - which file format should I specify?

I am trying to extract information from a multi-fasta file (e.g. C/G/A/T count, CG%) using biopython. I keep running into trouble when I try to iterate over the file for each fasta sequence - I can ...
1
vote
1answer
57 views

How to use EPOST and then use ESEARCH in biopython?

I have a list of gene ids: id_list = ["19304878", "18606172", "16403221", "16377612", "14871861", "14630660"] How can I get just the nucleotide sequences of these genes using EPOST and ESEARCH in ...
1
vote
1answer
26 views

Limiting the number of hits in a Biopython NCBIWWW Search

I'm working on trying to automate some BLAST searches. I need to pick up only the top three results from the BLAST results, however the parameter hitlist_size doesn't seem to be limiting my searches ...
2
votes
0answers
36 views

Converting PDB files to angles and back to PDB format

I have a PDB file that I am converting to a file of dihedral angles. I have now modified some angles and need to convert the modified angles back into a new PDB file. Is there any library ...
1
vote
0answers
12 views

How to initialize a PDB.DSSP object correctly

I am trying to get the DSSP of a PDB file, but Python is throwing File not found. I downloaded dssp from Windows' Ubuntu. Calling which dssp in the Ubuntu terminal gives '/usr/bin/dssp'. Calling dssp with ...
0
votes
0answers
17 views

PDB file parser not being created in Python

I am trying to declare a PDB parser object; however, the object does not get created. I'm showing the method where this is a problem. I've downloaded Biopython and scikit. The line that causes problems ...
0
votes
1answer
113 views

How can you analyse fna.gz in python?

I want to return the nth basepair given my fna.gz genome input. Theoretically it would work like this: allele = genome[14325] print(allele) #: G This is the code I have now: from Bio import SeqIO ...
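One way to approach this with only the standard library is to stream the gzipped FASTA and stop as soon as the requested position is reached. This is a minimal sketch, assuming the position refers to the first record in the file and is 0-based; `nth_base` is a hypothetical helper name, not part of Biopython.

```python
import gzip


def nth_base(path, index):
    """Return the base at 0-based `index` of the first record in a gzipped FASTA file.

    Hypothetical helper for illustration; only the first record is considered.
    """
    if index < 0:
        raise IndexError("index must be non-negative")
    seen = 0          # number of sequence characters consumed so far
    started = False   # True once the first header line has been read
    with gzip.open(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if started:
                    break  # a second record begins, so stop scanning
                started = True
                continue
            if not line:
                continue
            if index < seen + len(line):
                return line[index - seen]
            seen += len(line)
    raise IndexError("position is beyond the end of the first record")
```

Streaming line by line avoids loading the whole genome into memory, at the cost of rescanning the file on every lookup; for many lookups it may be better to read the sequence once and index into it.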
1
vote
0answers
44 views

Trying to understand how importing modules (BioPython) in Sublime Text works

I am a beginner with coding in general. I have been enjoying using Sublime Text 3, but I've run into a problem that I can't figure out. I want to use the BioPython module in Sublime Text 3, but when I ...
0
votes
0answers
38 views

Multiple server requests and file writing without waiting for answer

I'm doing a protein prediction program based on genomic data and at some point I need to send multiple requests to a server and write the results in a file. I have around 100 requests and file writing ...
-1
votes
0answers
13 views

How to select first few base pairs from a FASTA/TEXT/GENBANK file for sequence alignment

I have to select only the first 1000 base pairs from a fasta file to do a local alignment with Bio.pairwise2. I don't know how to take the first 1000 base pairs. Can anyone help with that?
0
votes
1answer
47 views

Show only dna alignment score in biopython

I have DNA sequence data. For instance, X="ACGGGT" Y="ACGGT" I want to know the alignment score, thus I used biopython pairwise2 function. For example, from Bio import pairwise2 from Bio.pairwise2 ...
0
votes
1answer
32 views

How do I blast a local query against a local database in python/biopython?

First of all, I want to come clean that I am a super beginner in programming. I have 2 zip files (containing one database each) and 4 fasta files (three containing a protein sequence each and one ...
0
votes
1answer
38 views

Bio.Motifs throws KeyError 'd'

I'm using Biopython to process some NGS data, but I ran into a strange problem when using the motifs module in Biopython. Here is the code. frame = pd.DataFrame({'Spacer': seqs1.values()}, index=seqs.keys()) ...
0
votes
1answer
40 views

biopython in anaconda, not jupyter notebook

I am trying to install biopython in Jupyter Notebook, Anaconda, Ubuntu 16.04. I follow the procedure in biopython website and it runs on python. Python 3.6.8 |Anaconda, Inc.| (default, Dec 30 2018, ...
0
votes
0answers
30 views

Transforming a DNA string into a Seq object in Biopython

Is there any way I can transform a user input string into a Biopython Seq object? I have tried a lot of things and searched Google but found no answer. Thanks
0
votes
1answer
25 views

How to download _full_ RefSeq record using Efetch?

I have a problem downloading a full record from Nucleotide db. I use: from Bio import Entrez from Bio import SeqIO with Entrez.efetch(db="nuccore", rettype="gb", retmode="full", id="NC_007384") as ...
0
votes
0answers
30 views

Improve performance on counting of %letters in dic values over a loop (python)

I have three entities in the process: A dictionary (chimera) that contains one key (the sequence name) and a huge dna sequence composed of X ACGT letters: >>> chimera {'Chimera_seq': ...
0
votes
1answer
17 views

Convert e-utils command to equivalent in Bio.Entrez (BioPython)

I'm having some trouble connecting each command using BioPython. Can someone help me to transform this command line to the equivalent using BioPython? esearch -db assembly -query "GCF_002514765.1" | ...
0
votes
1answer
48 views

Get genome from NCBI with biopython

Python newbie here. I want to download the genome sequence for genome (NC_007779.1) using BioPython packages Entrez and SeqIO. So far, I have this code: from Bio import Entrez from Bio import SeqIO ...
1
vote
1answer
25 views

Iterating through a series of GenBank genes and appending each gene's features to a list returns only the last gene

I'm having a problem with my code. I'm trying to iterate through the genbank file's list of genes using BioPython. Here's what it looks like: class genBank: gbProtId = str() gbStart = int() ...
0
votes
2answers
30 views

Substring multifasta file using python

I am trying to extract sequences from a multifasta file from position 2 to 8 (seeds of microRNAs). To do this I have written a small python script. The script works but I couldn't write an output file....
1
vote
1answer
59 views

How to print the first few records using SeqIO from Biopython

I have a fasta file that has several hundred records but I'm trying to return a table with just the first 20 records (record description, AA length, and name). My code is not working and I would ...
0
votes
0answers
33 views

How do I import biopython from php

I'm developing a web application in a remote server and I cannot find a way to import Biopython from a php script. First I got this error: ImportError: No module named Bio Then, to solve this issue ...
3
votes
1answer
66 views

Directly calling SeqIO.parse() in for loop works, but using it separately beforehand doesn't? Why?

In python this code, where I directly call the function SeqIO.parse() , runs fine: from Bio import SeqIO a = SeqIO.parse("a.fasta", "fasta") records = list(a) for asq in SeqIO.parse("a.fasta", "...
1
vote
1answer
13 views

Biopython Genbank.Record : trying to understand source code

I am writing a csv reader to generate Genbank files to capture annotations with sequence. First I used a Bio.SeqRecord and got correctly formatted output but the SeqRecord class lacks fields that I ...
0
votes
1answer
38 views

Retrieve data from GenBank with Bio.Entrez module

I am trying to solve one of the Rosalind challenges and I can't seem to find a way to retrieve data, within a specific time frame. http://rosalind.info/problems/gbk/ Do/How Do I modify Entrez....
2
votes
2answers
49 views

Extracting gene sequences from FASTA File?

I have the following code that reads a FASTA file with 10 gene sequences and returns each sequence as a matrix. However the code seems to be missing the very last sequence and I wonder why? file=...
0
votes
1answer
79 views

problems with pairwise blast in biopython

I try to run a pairwise blast between two sequences within a python script and using the biopython blast tools. I have no problems running a blast against a local database by adding parameter db='...
0
votes
1answer
48 views

Utilizing biopython NcbitblastnCommandline to extract Nonsynonymous substitutions

I'm trying to use NcbitblastnCommandline to blast a protein query against a nucleotide sequence, and then report the hit. The program ran without error. However, in the result, my query sequence ...
2
votes
1answer
68 views

Replacing all of instances of a letter in a column of a FASTA alignment file

I am writing a script which can replace all of the instances of an amino acid residue in a column of a FASTA alignment file. Using AlignIO, I just can read an alignment file and extract information ...
0
votes
1answer
38 views

Entrez (biopython): how to restrict the term search to a specific journal? (PubMed)

I want to obtain all the articles in a specific journal that are related to a specific term/topic. I am trying to do so through PubMed using the Entrez package contained in Biopython. The ...
0
votes
0answers
35 views

How to not truncate my protein sequence output

Using biopython, I've parsed a fasta file to get a list of protein sequences. However, I can only get the truncated version, so when I write them in excel, I do not have the whole protein sequence (...
1
vote
1answer
55 views

How to generate IUPAC code from nucleotides?

I want to find the IUPAC equivalent to 2 different nucleotides. Example: I have A and C and I want M. Or: I have R and T and I want D. Is there a method for doing that in Biopython? (It sounds easy ...
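The lookup described here can be sketched as a set union followed by a reverse table lookup. The example below is a plain-Python sketch using the standard IUPAC nucleotide codes; `iupac_code` and both dictionaries are hypothetical names introduced for illustration (Biopython keeps a comparable table in `Bio.Data.IUPACData.ambiguous_dna_values`, which could be used instead of the hand-written one).

```python
# Standard IUPAC nucleotide codes mapped to the bases each one stands for.
IUPAC_TO_BASES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}

# Reverse table: set of bases -> single IUPAC code.
BASES_TO_IUPAC = {frozenset(bases): code for code, bases in IUPAC_TO_BASES.items()}


def iupac_code(*symbols):
    """Return the IUPAC code covering every base implied by the given symbols."""
    bases = set()
    for symbol in symbols:
        # Expand each (possibly ambiguous) symbol into its concrete bases.
        bases.update(IUPAC_TO_BASES[symbol.upper()])
    return BASES_TO_IUPAC[frozenset(bases)]


print(iupac_code("A", "C"))  # M
print(iupac_code("R", "T"))  # D
```

The same function also covers collapsing one alignment column of several sequences into a single ambiguous symbol, since it accepts any number of inputs. For RNA, `U` would need to be added to the table; that is left out here.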
3
votes
2answers
104 views

Fastest way to count instances of substrings in string Python3.6

I have been working on a program which requires the counting of sub-strings (up to 4000 sub-strings of 2-6 characters located in a list) inside a main string (~400,000 characters). I understand this ...
0
votes
2answers
71 views

Optimizing dictionary look up using dict.items() for large dataset

I am a newbie and started coding in Python in the last few months. I have a script that takes a proteome (800 Kb file of 2850 strings) and checks each individual protein (protein_string) against a large ...
-3
votes
1answer
78 views

Rosalind doesn't accept “Variables and Some Arithmetic” task

Link for the problem http://rosalind.info/problems/ini2/ Given: Two positive integers a and b, each less than 1000. Return: The integer corresponding to the square of the hypotenuse of the right ...
1
vote
1answer
65 views

Trim sequences based on alignment

I'm trying to edit an MSA (Multiple Sequence Alignment) file generated by ClustalW, to trim sequences before the consensus one, using BioPython. xxx refers to other bases not relevant here Here's ...
0
votes
1answer
119 views

How to solve HTTP Error 429 in BioPython?

I'm trying to use BioPython to acquire nucleotide sequences by inputting accession number and start and end positions. I need to acquire many sequences but the process was aborted just after 3 ...
0
votes
3answers
76 views

Read nucleotides in FASTA without using BioPython

I need to obtain the same output obtained with the following code, but without using BioPython. I'm stuck... Could anyone help me? Thanks!!! from Bio import SeqIO records = SeqIO.parse("data/...
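Since the original code is truncated, the exact output it should reproduce is unknown. As a rough sketch of the general technique, a FASTA file can be read without Biopython by grouping lines under the most recent `>` header. `read_fasta` is a hypothetical helper name, and the sketch assumes a plain-text, uncompressed file.

```python
def read_fasta(path):
    """Yield (header, sequence) tuples from a plain-text FASTA file.

    The header is returned without the leading '>'; sequence lines that
    belong to one record are joined into a single string.
    """
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)  # emit the previous record
                header, chunks = line[1:], []
            elif header is not None:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)  # emit the final record
```

The final `yield` after the loop matters: forgetting it is a common reason the last record in a file goes missing. Per-nucleotide counts can then be taken with `sequence.count("A")` and so on.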
0
votes
1answer
59 views

Is there a method to write one ambiguous rna sequence from multiple unambiguous rna sequences in Python 3?

I have many RNA sequences of the same length. Now I want to create a function that will give me one line of ambiguous RNA as output. So far I'm not finding any useful information on writing ambiguous ...
0
votes
1answer
144 views

How can I fix this error: “BiopythonWarning: Partial codon, len(sequence) not a multiple of three.”?

For an assignment, I need to write code that translates an RNA sequence from a fasta file to an amino acid sequence. However, I keep getting the following warning message: " BiopythonWarning: ...
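The warning indicates that the sequence length is not a multiple of three, so the last one or two bases do not form a full codon. One possible workaround, sketched here in plain Python, is to drop the trailing partial codon before translating. Whether trimming is appropriate depends on the data: it assumes the sequence starts in the correct reading frame. `trim_to_codons` is a hypothetical helper name.

```python
def trim_to_codons(sequence):
    """Return `sequence` with any trailing partial codon removed."""
    remainder = len(sequence) % 3          # 0, 1 or 2 leftover bases
    return sequence[: len(sequence) - remainder]


print(trim_to_codons("AUGGCCA"))  # AUGGCC
```

The same slicing expression works on string-like sequence objects, so the trimmed result can be passed on to whatever translation step is already in place.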
0
votes
2answers
69 views

motif finder in a text file using python

I have a big text file like this example: >chr9:128683-128744 GGATTTCTTCTTAGTTTGGATCCATTGCTGGTGAGCTAGTGGGATTTTTTGGGGGGTGTTA >chr16:134222-134283 ...
-1
votes
1answer
52 views

How to fix "'generator' object is not subscriptable" error when reading fasta file with BioPython

I am trying to open and read a fasta file and use only the first line from the input. Currently, I'm calling the first line and appending it to a list to use in a later function. However, I'm ...
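This error typically appears when an iterator is indexed like a list (for example `records[0]`). Iterators hand out items one at a time, so the first item is taken with `next()`, and the first few with `itertools.islice`. The sketch below uses a stand-in generator, `fake_records`, with made-up values rather than a real parser, so it runs without any input file; `first_item` and `first_n` are hypothetical helper names.

```python
from itertools import islice


def fake_records():
    """Stand-in for a record iterator; the values are made up for illustration."""
    yield "record_1"
    yield "record_2"
    yield "record_3"


def first_item(iterator):
    """Return the first item of an iterator, or None if it is empty."""
    return next(iterator, None)


def first_n(iterator, n):
    """Return up to the first `n` items of an iterator as a list."""
    return list(islice(iterator, n))


print(first_item(fake_records()))   # record_1
print(first_n(fake_records(), 2))   # ['record_1', 'record_2']
```

An iterator can be consumed only once: after `list(iterator)` or a full `for` loop, a second pass yields nothing, so the source has to be opened again or the items stored in a list first.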
0
votes
0answers
16 views

Trying to modify the Levenshtein distance or editdistance Python library

I would like to modify the editdistance library. For example, in editdistance or Levenshtein distance, this generates a value of 1, but I want the result for this sequence to be 0: str1 = "AAAA" str2 = ...
0
votes
0answers
47 views

Translating multiple rna sequences from a FASTA file into proteins in BioPython

I need to translate multiple unambiguous RNA sequences present in one FASTA file in Biopython. How can I put the data from all RNA sequences in a single piece of code to translate it to proteins?
0
votes
1answer
33 views

Substituting “None” for the object that raises a NameError

I have many XML files and I am using Python to parse the data. For example, please regard the Object as the result of parsed XML data. Moreover, the Object has Object_A that I want to parse. My ...
0
votes
1answer
38 views

Drawing multiple sequences from 1 file, based on shared fields in another file

I'm trying to run a python script to draw sequences from a separate file (merged.fas), with respect to a list (gene_fams_eggnog.txt) I have as output from another program. The code is as follows: from ...
0
votes
0answers
49 views

Read user inputted .fasta file and parse using Biopython?

I am trying to create a python script where the user can type in their FASTA file and that file will then be parsed using Biopython. I am struggling to get this to work. The script I have thus far is ...