Run GRN Inference

GRN Inference Without Method Integration

In this section, we explain how to access datasets and infer gene regulatory networks (GRNs) using your method without integrating it into geneRNIB.

### 1. Download the Inference Datasets The inference datasets are stored on AWS and can be downloaded using the following command:

aws s3 sync s3://openproblems-data/resources/grn/grn_benchmark/inference_data resources/grn_benchmark/inference_data --no-sign-request

### 2. Available Datasets The available datasets include op, nakatake, replogle, adamson, and norman. Each dataset provides RNA data. The op dataset also includes ATAC data.

### 3. GRN Inference Guidelines When performing GRN inference, consider the following:

  • We evaluate only the top TF-gene pairs, currently limited to 50,000 edges, ranked by their assigned weight.

  • The inferred network should follow this format:

    Columns: - source: Transcription factor (TF) - target: Target gene - weight: Regulatory importance/likelihood score/etc.

### 4. Saving the Inferred Network Since geneRNIB works with AnnData, your inferred network should be saved in this format.

If your network is a pandas DataFrame with three columns (source, target, weight), you can save it as follows (replace grnboost2 and norman with your method and the dataset name used to infer the GRN):

For R, use the following approach: .. .. code-block:: r

net$weight <- as.character(net$weight)

output <- AnnData(

X = matrix(nrow = 0, ncol = 0), uns = list(

method_id = “grnboost2”, dataset_id = “norman”, prediction = net[, c(“source”, “target”, “weight”)]

)

)

output$write_h5ad(“save_to_file.h5ad”, compression = “gzip”)

### Next Steps Once you have inferred GRNs for one or more datasets, proceed to the next section to run the evaluation.