Python write tab delimited file

From a file, I have taken a line and split it into 5 columns using split(). Now I have to write those columns as tab-separated values to an output file.

Let's say that I have l[1], l[2], l[3], l[4], l[5], a total of 5 entries. How can I achieve this using Python? Also, I am not able to write the l[1], l[2], l[3], l[4], l[5] values to an output file.

I tried both of these snippets, and neither works (I am using Python 2.6):

code 1:

with open('output', 'w'):
   print l[1], l[2], l[3], l[4], l[5] > output

code 2:

with open('output', 'w') as outf:
   outf.write(l[1], l[2], l[3], l[4], l[5])
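
For reference, here is a minimal sketch of one way to get the intended result: join the columns with a tab and write the resulting single string, since write() accepts one string rather than several arguments. The sample line, the `fields` name, and the `output.tsv` path are illustrative assumptions, not part of the original post.

```python
import os
import tempfile

# Hypothetical input line; in the question this would be read from the input file.
line = "a b c d e"
fields = line.split()  # five columns, indexed 0 through 4

# Hypothetical output path inside a temporary directory.
out_path = os.path.join(tempfile.mkdtemp(), "output.tsv")

with open(out_path, "w") as outf:
    # write() takes exactly one string, so join the columns with tabs first
    outf.write("\t".join(fields) + "\n")

# Read the file back to check what was written
with open(out_path) as inf:
    written = inf.read()
```

Note that Python lists are zero-indexed, so a line split into five columns is addressed as l[0] through l[4]; l[5] would raise an IndexError.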


I am working on a project using a FASTA file. I am writing my script in nano on the command line and running it with Python, also from the command line.

I would like my script to produce a tab-delimited file with three columns: the first column should contain the sequence name, the second the sequence length, and the third the sequence itself.

I have written the following script so far in nano:

from Bio import SeqIO
import sys
for hello_fasta in SeqIO.parse(sys.argv[1], "fasta"):

  list = hello_fasta.split("\t")

  print hello_fasta.description
  print (len(hello_fasta.seq))

For example, I would like the output columns in the following order: gene name, gene length, gene sequence.

H0192X 26 FORUWOHRPPTRWFAWWEAKJNFWEJ

asked Nov 14, 2020 at 21:35


You can use a list and insert() to add the elements in a specific order, then unpack the list with *. Or you can use join().

from Bio import SeqIO
import sys

for hello_fasta in SeqIO.parse(sys.argv[1], "fasta"):
  sequences = []
  sequences.insert(0, hello_fasta.description)
  sequences.insert(1, len(hello_fasta.seq))
  sequences.insert(2, hello_fasta.seq)
  # option 1
  print(*sequences, sep='\t')
  # option 2
  print('\t'.join(map(str, sequences)))

answered Nov 16, 2020 at 14:31

zorbax



Here's a solution using pandas if you want to save the tsv:

from Bio import SeqIO
import pandas as pd
from io import StringIO

example = """
>seq0
FQTWEEFSRAAEKLYLADPMKVRVVLKYRHVDGNLCIKVTDDLVCLVYRTDQAQDVKKIEKF
>seq1
KYRTWEEFTRAAEKLYQADPMKVRVVLKYRHCDGNLCIKVTDDVVCLLYRTDQAQDVKKIEKFHSQLMRLMELKVTDNKECLKFKTDQAQEAKKMEKLNNIFFTLM
>seq2
EEYQTWEEFARAAEKLYLTDPMKVRVVLKYRHCDGNLCMKVTDDAVCLQYKTDQAQDVKKVEKLHGK
>seq3
MYQVWEEFSRAVEKLYLTDPMKVRVVLKYRHCDGNLCIKVTDNSVCLQYKTDQAQDVK
>seq4
EEFSRAVEKLYLTDPMKVRVVLKYRHCDGNLCIKVTDNSVVSYEMRLFGVQKDNFALEHSLL
>seq5
SWEEFAKAAEVLYLEDPMKCRMCTKYRHVDHKLVVKLTDNHTVLKYVTDMAQDVKKIEKLTTLLMR
>seq6
FTNWEEFAKAAERLHSANPEKCRFVTKYNHTKGELVLKLTDDVVCLQYSTNQLQDVKKLEKLSSTLLRSI
>seq7
SWEEFVERSVQLFRGDPNATRYVMKYRHCEGKLVLKVTDDRECLKFKTDQAQDAKKMEKLNNIFF
>seq8
SWDEFVDRSVQLFRADPESTRYVMKYRHCDGKLVLKVTDNKECLKFKTDQAQEAKKMEKLNNIFFTLM
>seq9
KNWEDFEIAAENMYMANPQNCRYTMKYVHSKGHILLKMSDNVKCVQYRAENMPDLKK
>seq10
FDSWDEFVSKSVELFRNHPDTTRYVVKYRHCEGKLVLKVTDNHECLKFKTDQAQDAKKMEK
"""

# This example just happens to be a string, just load your
# fasta file using the method you're already using
example_records = SeqIO.parse(StringIO(example), 'fasta')

# Dictionary to hold the data you eventually want in the tsv
data = {"Gene name" : list(),
        "Gene length" : list(),
        "Gene seq" : list()}

# Append the necessary into the data dictionary
for record in example_records:
    data['Gene name'].append(record.description)
    data['Gene length'].append(len(record.seq))
    data['Gene seq'].append(str(record.seq))

# Convert your data into a pandas DataFrame and save as a tsv
gene_df = pd.DataFrame(data)
gene_df.to_csv("gene_info.tsv", sep = '\t', index = False)

This results in a tsv that looks like this:

$ head gene_info.tsv
Gene name       Gene length     Gene seq
seq0    62      FQTWEEFSRAAEKLYLADPMKVRVVLKYRHVDGNLCIKVTDDLVCLVYRTDQAQDVKKIEKF
seq1    106     KYRTWEEFTRAAEKLYQADPMKVRVVLKYRHCDGNLCIKVTDDVVCLLYRTDQAQDVKKIEKFHSQLMRLMELKVTDNKECLKFKTDQAQEAKKMEKLNNIFFTLM
seq2    67      EEYQTWEEFARAAEKLYLTDPMKVRVVLKYRHCDGNLCMKVTDDAVCLQYKTDQAQDVKKVEKLHGK
seq3    58      MYQVWEEFSRAVEKLYLTDPMKVRVVLKYRHCDGNLCIKVTDNSVCLQYKTDQAQDVK
seq4    62      EEFSRAVEKLYLTDPMKVRVVLKYRHCDGNLCIKVTDNSVVSYEMRLFGVQKDNFALEHSLL
seq5    66      SWEEFAKAAEVLYLEDPMKCRMCTKYRHVDHKLVVKLTDNHTVLKYVTDMAQDVKKIEKLTTLLMR
seq6    70      FTNWEEFAKAAERLHSANPEKCRFVTKYNHTKGELVLKLTDDVVCLQYSTNQLQDVKKLEKLSSTLLRSI
seq7    65      SWEEFVERSVQLFRGDPNATRYVMKYRHCEGKLVLKVTDDRECLKFKTDQAQDAKKMEKLNNIFF
seq8    68      SWDEFVDRSVQLFRADPESTRYVMKYRHCDGKLVLKVTDNKECLKFKTDQAQEAKKMEKLNNIFFTLM

Hopefully this helps!

answered Feb 7, 2021 at 21:21


How do you set a tab delimiter in Python?

A tab-delimited file uses just two punctuation rules to encode the data:
Each row is delimited by an ordinary newline character, usually the standard \n.
Within a row, columns are delimited by a single character, often \t.
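
As a rough illustration of those two rules, the standard-library csv module can be told to use a tab as the column delimiter and \n as the row terminator. The sample rows and the in-memory buffer below are assumptions made for the sketch; a real script would pass a file opened with open(path, "w", newline="") instead.

```python
import csv
import io

# Illustrative rows (hypothetical data)
rows = [
    ["name", "length"],
    ["seq0", 62],
    ["seq1", 106],
]

# An in-memory buffer stands in for a real output file
buffer = io.StringIO()

# delimiter="\t" separates columns; lineterminator="\n" ends each row
writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
writer.writerows(rows)

tsv_text = buffer.getvalue()
```

Using the csv module rather than a manual join also means that fields containing tabs or newlines get quoted for you.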

What is a tab-delimited file?

A tab-delimited file is a well-known and widely used text format for data exchange. By using a structure similar to that of a spreadsheet, it also allows users to present information in a way that is easy to understand and share across applications - including relational database management systems.

How do you write a comma separated text file in Python?

To write a CSV file in Python:
First, open the CSV file for writing (w mode) by using the open() function.
Second, create a CSV writer object by calling the writer() function of the csv module.
Third, write data to the CSV file by calling the writerow() or writerows() method of the CSV writer object.
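
The three steps above can be sketched as follows. The file name `example.csv`, the temporary directory, and the sample records are assumptions chosen for illustration.

```python
import csv
import os
import tempfile

# Hypothetical data to write
header = ["name", "length"]
records = [["seq0", 62], ["seq1", 106]]

# Hypothetical output path inside a temporary directory
csv_path = os.path.join(tempfile.mkdtemp(), "example.csv")

# Step 1: open the file for writing; newline="" lets the csv module
# control the line endings itself
with open(csv_path, "w", newline="") as handle:
    # Step 2: create the writer object
    writer = csv.writer(handle)
    # Step 3: write one row with writerow(), several with writerows()
    writer.writerow(header)
    writer.writerows(records)

# Read the file back; csv.reader returns every field as a string
with open(csv_path, newline="") as handle:
    rows_read = list(csv.reader(handle))
```

Passing delimiter="\t" to csv.writer() would produce a tab-separated file with the same steps.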

How do you write a list to a file in Python?

Steps to write a list to a file in Python:
Open the file in write mode. Pass the file path and the access mode w to the open() function.
Iterate over the list using a for loop to visit each item.
Write the current item into the file.
Close the file after completing the write operation.
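
A minimal sketch of these steps, assuming a list of strings and a hypothetical `items.txt` path in a temporary directory. The with statement closes the file automatically, which covers the final step.

```python
import os
import tempfile

# Hypothetical list to save
items = ["alpha", "beta", "gamma"]

# Hypothetical output path inside a temporary directory
list_path = os.path.join(tempfile.mkdtemp(), "items.txt")

# Open in write mode; the file is closed when the with block ends
with open(list_path, "w") as handle:
    for item in items:
        # write() does not add a newline, so append one per item
        handle.write(item + "\n")

# Read the file back, one list entry per line
with open(list_path) as handle:
    lines_read = handle.read().splitlines()
```

Non-string items would need converting with str() before being written.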
