Omega — harnessing the power of large language models for bioimage analysis

  • Correspondence

From Nature Methods

Fig. 1: Harnessing the power of LLMs for bioimage analysis with Omega.

Code availability

The source code as well as instructions and documentation for Omega can be found on GitHub at https://github.com/royerlab/napari-chatgpt.

References

  1. OpenAI et al. Preprint at arXiv https://doi.org/10.48550/arXiv.2303.08774 (2023).

  2. Sofroniew, N. et al. napari: a multi-dimensional image viewer for Python. Zenodo https://doi.org/10.5281/zenodo.8115575 (2022).

  3. Pachitariu, M. & Stringer, C. Nat. Methods 19, 1634–1641 (2022).

  4. Weigert, M., Schmidt, U., Haase, R., Sugawara, K. & Myers, G. In 2020 IEEE Winter Conference on Applications of Computer Vision (WACV) 3655–3662 (IEEE, 2020); https://doi.org/10.1109/WACV45572.2020.9093435

  5. Solak, A. C., Royer, L. A., Janhangeer, A. R. & Kobayashi, H. royerlab/aydin: v0.1.15. Zenodo https://doi.org/10.5281/ZENODO.5654826 (2022).

  6. Harris, C. R. et al. Nature 585, 357–362 (2020).

  7. van der Walt, S. et al. PeerJ 2, e453 (2014).

  8. Bradski, G. Dr. Dobbs J. Softw. Tools Prof. Program. 25, 120–123 (2000).

  9. Lam, S. K., Pitrou, A. & Seibert, S. In Proc. Second Workshop on the LLVM Compiler Infrastructure in HPC 1–6 (Association for Computing Machinery, 2015).

  10. Nishino, R. & Loomis, S. H. C. In 31st Conf. Neural Inf. Process. Syst. 151 (Curran Associates, 2017).

Acknowledgements

Thanks to L. Kilpatrick and M. A. Kittisopikul for facilitating early API access to GPT-4 models and J. Batson for facilitating early API access to Claude models. Thanks to S. Schmid for careful proofreading. Thanks to R. Haase, I. Theodoro and J. Bragantini for code contributions and A. Jacobo for suggestions and discussion. Chan Zuckerberg Biohub San Francisco (CZB SF) funded this work. I thank the CZB SF donors P. Chan and M. Zuckerberg for their generous support.

Author information

Corresponding author

Correspondence to Loïc A. Royer.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Methods thanks Constantin Pape and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Rita Strack, in collaboration with the Nature Methods team.

Supplementary information

Supplementary Information

Supplementary Figures 1–3, Methods, legends for Videos 1–24 and legends for Tables 1–3.

Reporting Summary

Supplementary Video 1

Omega can segment nuclei with StarDist and perform follow-up analysis. The video showcases Omega’s ability to segment cell nuclei in a 2D image using Stardist. Omega successfully segments the nuclei and adds a label layer to the napari viewer. With further instructions, Omega can count the segmented nuclei and create a CSV file on the desktop folder of the machine. This file contains coordinates and areas of all segments, sorted by decreasing area, with one segment per row. Omega also opens the file using the system’s default CSV viewer. The video has been sped up by a factor of 2.
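
The follow-up analysis described in this legend (one row per segment with coordinates and area, sorted by decreasing area) can be sketched as follows. This is a minimal NumPy-only illustration, not the code Omega generated in the video. The function names `segment_table` and `write_segment_csv`, the column names, and the use of the centroid as each segment's coordinates are assumptions made for this example.

```python
import csv

import numpy as np


def segment_table(labels):
    """Return one dict per segment (label, centroid y/x, area), largest area first.

    `labels` is a 2D integer label image in which 0 is background.
    """
    labels = np.asarray(labels)
    rows = []
    for seg_id in np.unique(labels):
        if seg_id == 0:
            continue  # skip background
        ys, xs = np.nonzero(labels == seg_id)
        rows.append(
            {
                "label": int(seg_id),
                "y": float(ys.mean()),  # centroid row (assumed meaning of "coordinates")
                "x": float(xs.mean()),  # centroid column
                "area": int(ys.size),   # area in pixels
            }
        )
    rows.sort(key=lambda row: row["area"], reverse=True)
    return rows


def write_segment_csv(rows, path):
    """Write the table to `path`, one segment per row."""
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=["label", "y", "x", "area"])
        writer.writeheader()
        writer.writerows(rows)
```

In practice, `labels` would be the data of the labels layer produced by the segmentation, and `path` would point to a file in the user's chosen folder.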

Supplementary Video 2

Omega can segment nuclei in a 3D image. This video shows how Omega segments the nuclei in a 3D image displayed in the napari viewer. Omega uses a specialized tool for cell and nuclei segmentation and employs a ‘classic’ approach that combines single thresholding, specifically Otsu, with watershed splitting to prevent under-segmentation. After segmentation, Omega adds a labels layer to the viewer, and we inquire about the number of segments detected. The response is 27. The video has been sped up by a factor of 2.

Supplementary Video 3

Omega can devise step-by-step strategies and interactively execute them. In this video, we requested Omega’s assistance in developing a detailed strategy for segmenting nuclei in a 2D image. We clarified that the nuclei appear brighter than the background. Omega provided us with a 6-step plan. The first step involved loading the image into napari, which was already done. Next, Omega suggested applying a Gaussian filter to smooth the image and eliminate noise. However, since the image was not noisy, we asked Omega to move on to step 3, which involved thresholding. Using the scikit-image library, Omega utilized the Otsu method to determine the threshold value and convert the image to binary form. As a result, a new layer was added to the viewer with the outcome. We then asked Omega to implement step 4, which involved morphological operations to remove minor artifacts and separate touching nuclei. We specifically requested two erosions. However, we wondered whether applying grey morphology operators to the original image would be more sensible. Omega agreed and provided us with an updated plan that swapped the order of thresholding and erosion. We started over and used the new plan, beginning with step 3 and proceeding to steps 4 and 5, resulting in a reasonably good segmentation. The video has been sped up by a factor of 2.

Supplementary Video 4

Omega can make widgets on demand, e.g., to filter segments per area. In this video, we first ask that Omega segment the nuclei in the currently selected 2D image. Then, we tell Omega to make a widget that can filter the segments in a label layer according to their area. Segments whose areas are outside of a given range are removed from the newly created labels layer. We then start using that widget and experiment with the two parameters: min area and max area. The video has been sped up by a factor of 2.
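
The core operation behind such an area filter can be sketched in a few lines of NumPy. This is an illustrative assumption about how the filtering might be done, not the widget code Omega produced; the function name `filter_labels_by_area` is hypothetical, and wrapping it in a napari widget is left out.

```python
import numpy as np


def filter_labels_by_area(labels, min_area, max_area):
    """Return a copy of `labels` keeping only segments with min_area <= area <= max_area.

    `labels` must contain non-negative integers, with 0 as background.
    Removed segments are set to 0; kept segments retain their original label.
    """
    labels = np.asarray(labels)
    # areas[i] is the number of pixels carrying label i.
    areas = np.bincount(labels.ravel(), minlength=1)
    keep = (areas >= min_area) & (areas <= max_area)
    keep[0] = False  # background is never a segment
    # Lookup table mapping each label to itself (kept) or to 0 (removed).
    lookup = np.where(keep, np.arange(areas.size), 0)
    return lookup[labels]
```

Returning a new array rather than modifying the input matches the behaviour described above, in which the filtered result goes into a newly created labels layer.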

Supplementary Video 5

Omega can make complex widgets such as for color max projection. In the video, we requested that Omega create a widget for max color projection of a 3D stack within the napari viewer. The hue variation represents the depth within the stack where the maximum intensity is observed, illuminating the spatial arrangement of the nuclei. Luminance correlates with the maximum intensity of the voxels, highlighting areas of peak fluorescence. Saturation reflects the contrast between the maximum intensity and the average intensity across depth, thus suppressing hue variation caused by noise. The video has been sped up by a factor of 2.
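
One possible reading of this hue, saturation and luminance mapping is sketched below. The exact formulas are assumptions chosen for illustration (hue as depth index divided by the number of planes, saturation as (max - mean) / max, value as the max normalised by its global peak) and are not taken from the widget shown in the video. The function name `color_max_projection` is hypothetical.

```python
import colorsys

import numpy as np


def color_max_projection(stack):
    """Depth-coded max projection of a (Z, Y, X) stack, returned as (Y, X, 3) RGB in [0, 1]."""
    stack = np.asarray(stack, dtype=float)
    n_planes = stack.shape[0]

    max_val = stack.max(axis=0)      # brightest value along depth
    mean_val = stack.mean(axis=0)    # average value along depth
    depth = stack.argmax(axis=0)     # plane index where the maximum occurs

    # Hue encodes depth; dividing by n_planes avoids wrapping back to red.
    hue = depth / n_planes
    # Saturation: contrast between max and mean, so flat (noisy) columns turn grey.
    sat = np.divide(
        max_val - mean_val, max_val, out=np.zeros_like(max_val), where=max_val > 0
    )
    sat = np.clip(sat, 0.0, 1.0)
    # Value (luminance): max intensity normalised to the global peak.
    peak = max_val.max()
    val = max_val / peak if peak > 0 else np.zeros_like(max_val)

    # colorsys works on scalars, so apply it element-wise.
    r, g, b = np.vectorize(colorsys.hsv_to_rgb)(hue, sat, val)
    return np.stack([r, g, b], axis=-1)
```

The element-wise `colorsys` conversion keeps the sketch dependency-free but is slow on large images; a vectorised HSV-to-RGB conversion would be preferable in a real widget.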

Supplementary Video 6

Omega’s AI-augmented code editor. In this video, we request a widget that visualizes the Fourier spectrum of a 2D image. Omega makes such a widget, which we test on the camera image. We then can find the source code generated by Omega for that widget in the code editor. We then show the different features of the code editor, such as the code cleanup tool, as well as the AI-powered code safety check tool, code commenting tool, and code modification tool.
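
For orientation, the computation at the heart of a Fourier spectrum widget might look like the following sketch. It shows a common way to display a 2D spectrum (centred, log-scaled magnitude) and is an assumption about the approach, not the source code generated in the video; the function name `log_magnitude_spectrum` is hypothetical.

```python
import numpy as np


def log_magnitude_spectrum(image):
    """Return the centred, log-scaled magnitude of the 2D Fourier transform of `image`."""
    image = np.asarray(image, dtype=float)
    # fftshift moves the zero-frequency component to the centre of the array.
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    # log1p compresses the large dynamic range so weak frequencies remain visible.
    return np.log1p(np.abs(spectrum))
```

The returned array has the same shape as the input and can be displayed as an ordinary image layer.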

Supplementary Video 7

Sharing code and widgets across machines. This video shows how Omega’s code editor can send code across the network to another instance. There is no need to copy and paste the code and send it via email or messaging tool. Simply choose the file, choose the recipient, and it will be sent. All other open instances of the code editor running on machines connected to the same network will be automatically discovered as potential recipients.

Supplementary Video 8

Omega can also work with other LLM models besides ChatGPT. This short video shows that Omega also works with Anthropic’s Claude LLM model. The video has been sped up by a factor of 2.

Supplementary Video 9

Omega corrects its own coding mistakes. In the video, Omega applied the SLIC super-pixel segmentation algorithm to a selected image. However, Omega made a mistake using the non-existent ‘multichannel’ parameter when using the scikit-image SLIC function, resulting in an error. Omega noticed this mistake and corrected it on the second try, successfully adding the segmented image to the viewer. The video has been sped up by a factor of 2.

Supplementary Video 10

Omega can search for and open image files from the web. In this video, we requested Omega to open a dataset from Blin et al.’s PLOS Biology 2019 in napari. The dataset can be accessed online and streamed using the ZARR image file format and library. Omega was able to fulfill our request successfully, letting us explore the dataset. Next, we requested Omega to open a picture of Albert Einstein in napari. Omega then utilized its web image search function to locate a suitable image and loaded it into napari. The video has been sped up by a factor of 2.

Supplementary Video 11

Omega can teach concepts in image processing. In this video, we ask Omega what it knows about ‘gradient-based image fusion.’ Omega then proceeds to give an interesting explanation of the general idea behind this approach to image fusion. We then ask Omega to apply these ideas and make a widget that takes two image layers and returns the gradient-based image fusion of these two images. Omega successfully creates a functional widget that we test on two images. The video has been sped up by a factor of 2.

Supplementary Video 12

Omega can do math and write arbitrary Python code. In this video, we test Omega’s Python coding skills by asking some basic math questions. For example, we asked for the value of 1010+1 and the number of permutations possible with ten objects. Then, we asked Omega to write all permutations of a list of 5 strings (‘a’, ‘b’, ‘c’, ‘d’, ‘e’) to a file on the machine’s Desktop folder, with one permutation per row. Omega completed this task and opened the file using the system’s default text viewer. Following this, we asked to create a new file containing only permutations where the letters ‘a’ and ‘b’ are consecutive, providing some examples. However, we soon realized that our statement could have been clearer as it was ambiguous whether the order of ‘a’ and ‘b’ mattered. The video has been sped up by a factor of 2.
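
The permutation tasks in this legend, including the ambiguity about whether the order of ‘a’ and ‘b’ matters, can be illustrated with the standard library alone. This is a sketch written for this purpose, not the code Omega produced; the helper names and the `ordered` flag are hypothetical, and the adjacency check assumes single-character strings.

```python
import math
from itertools import permutations


def all_permutations(items):
    """Return every permutation of `items` as a joined string."""
    return ["".join(p) for p in permutations(items)]


def with_adjacent(perms, first, second, ordered=False):
    """Keep permutations in which `first` and `second` are consecutive.

    With ordered=True, only `first` immediately followed by `second` counts;
    otherwise either order is accepted.
    """
    kept = []
    for perm in perms:
        if first + second in perm or (not ordered and second + first in perm):
            kept.append(perm)
    return kept


def write_rows(rows, path):
    """Write one permutation per row to a text file at `path`."""
    with open(path, "w") as handle:
        handle.write("\n".join(rows) + "\n")


perms = all_permutations(["a", "b", "c", "d", "e"])
print(math.factorial(10))                                 # 3628800 permutations of ten objects
print(len(perms))                                         # 120
print(len(with_adjacent(perms, "a", "b")))                # 48 (either order)
print(len(with_adjacent(perms, "a", "b", ordered=True)))  # 24 ('a' directly before 'b')
```

The two counts differ by a factor of two, which is why an unambiguous prompt matters here.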

Supplementary Video 13

Omega can control the napari viewer. This video showcases how Omega can manage the napari viewer window. Initially, we requested to change the viewer to 3D rendering mode. Subsequently, we ask it to rotate the orientation of the 3D image by 20 degrees on all axes and zoom in by 50% twice. Then, we request to modify the gamma setting of all layers to a value of 1.5. Finally, we eliminate all layers in the viewer except for the ‘nuclei’ one. Lastly, we zoom out and switch back to 2D rendering mode. The video has been sped up by a factor of 2.

Supplementary Video 14

Omega can determine how to call Python functions. In the video, we requested information from Omega regarding the convolution function in scipy’s ndimage package. Omega provided an extensive explanation of the function signature and details about the parameters. However, when we asked to apply the function to a selected image, it generated code for a 2D image instead of a 3D image. After informing Omega that the image was, in fact, 3D, it was able to apply the function successfully with appropriate default parameters. The video has been sped up by a factor of 2.

Supplementary Video 15

Omega can use Cellpose to segment cells and nuclei. This brief video showcases how Omega utilizes Cellpose to segment cell nuclei in a 2D image (z-projection). The video has been sped up by a factor of 2.

Supplementary Video 16

Omega can use Aydin to denoise images. This video showcases Omega’s access to our image-denoising tool Aydin. We first ask Omega to apply Aydin’s Noise2Self-FGR (Feature Generation & Regression) approach on a noisy single-channel photograph of the New York skyline (see detailed use case and tutorial here). We see some console output from Aydin running within Omega, and eventually, it displays a denoised version of the image overlaid as a new layer in napari. Next, we ask Omega to apply the same denoising algorithm to a 3D image of Drosophila Egg Chambers (LimSeg Test dataset, Machado et al.), which it does successfully. The video has been sped up by a factor of 2.

Supplementary Video 17

Omega can follow detailed instructions and has extensive knowledge of NumPy. In this video that runs for about 20 minutes, we demonstrate the process of creating a piece of ‘Digital Art’ by giving Omega detailed step-by-step instructions. We begin by requesting Omega to generate an empty image and continue by progressively altering it. We add noise and apply functions to the pixel values, threshold, and segment structures. This video highlights Omega’s proficiency in NumPy operations and the extensive text conversations that can be utilized for image processing and analysis. The video has been sped up by a factor of 2.

Supplementary Video 18

Omega knows how to use the scikit-image library for processing and analyzing images. This video showcases Omega’s mastery of the scikit-image library and image processing. We asked Omega to segment an image with bright coins on a dark background, but we realized that the background was not uniform. To correct the background, we consulted with Omega and learned about different algorithms that could be used. Initially, we attempted to use the rolling-ball algorithm, but we encountered some issues due to Omega’s use of a white tophat filter instead of a black tophat filter. We then tried CLAHE (Contrast Limited Adaptive Histogram Equalization), which worked reasonably well, but perhaps we should have used larger tiles. The video has been sped up by a factor of 2.

Supplementary Video 19

Omega knows how to use OpenCV. In this video, we asked Omega to download an MP4 movie from a provided URL. The movie shows a hallway with people passing by, a commonly used video for testing person detection algorithms. We then asked Omega to use the OpenCV library to detect people in each movie frame and draw a bounding box around each detection. Omega complied with our request and displayed each frame with bounding boxes around each detected person. However, we observed two issues. Firstly, adding each 2D movie frame as an individual napari image layer is impractical, as it results in many layers. Secondly, OpenCV’s default channel ordering (BGR rather than RGB) is incompatible with napari’s, causing the napari viewer to display incorrect colors for each frame. The video has been sped up by a factor of 2.
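
Both issues noted in this legend have simple remedies, sketched here with NumPy only. The helper names are hypothetical and the frames are synthetic. The sketch assumes frames arrive as (H, W, 3) arrays in OpenCV's BGR order.

```python
import numpy as np


def bgr_to_rgb(frame):
    """Reverse the channel axis of an (H, W, 3) frame, turning BGR into RGB.

    This gives the same result as cv2.cvtColor(frame, cv2.COLOR_BGR2RGB).
    """
    return frame[..., ::-1]


def frames_to_stack(frames_bgr):
    """Combine BGR frames into a single (T, H, W, 3) RGB array.

    One array can be shown as one image layer with a time slider,
    instead of one layer per frame.
    """
    return np.stack([bgr_to_rgb(frame) for frame in frames_bgr], axis=0)


# Synthetic 'pure blue' frame in BGR order: channel 0 holds blue.
blue_bgr = np.zeros((2, 2, 3), dtype=np.uint8)
blue_bgr[..., 0] = 255
stack = frames_to_stack([blue_bgr, blue_bgr])
print(stack.shape)     # (2, 2, 2, 3)
print(stack[0, 0, 0])  # [  0   0 255], i.e. blue in RGB order
```

In napari, such a stack could then be added as a single layer, for example with `viewer.add_image(stack, rgb=True)`.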

Supplementary Video 20

Omega knows how to use Numba. In the video, we asked Omega to perform a z-projection of a 3D image using the Numba library to speed up the code through just-in-time compilation. Although we did not specify the projection type, Omega used the reasonable choice of max projection and successfully computed it. However, during the process, Omega utilized the NumPy function np.max() in the just-in-time compiled function, defeating our purpose. We then requested Omega to refrain from using NumPy functions and instead write a z-projection loop. Omega completed the task, but this time, it opted for an average projection. We later explicitly asked Omega to perform a max projection. The video has been sped up by a factor of 2.
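
An explicit-loop max projection of the kind requested in this video might look like the sketch below. It is an illustration, not the code Omega wrote. The function name `max_z_projection` is hypothetical, and the fallback to plain Python when Numba is not installed is an assumption added so that the example runs anywhere.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption for portability: without Numba, run the same loops uncompiled.
    def njit(func):
        return func


@njit
def max_z_projection(stack):
    """Max projection along axis 0 of a (Z, Y, X) array using explicit loops."""
    nz, ny, nx = stack.shape
    out = np.empty((ny, nx), dtype=stack.dtype)
    for y in range(ny):
        for x in range(nx):
            best = stack[0, y, x]
            for z in range(1, nz):
                value = stack[z, y, x]
                if value > best:
                    best = value
            out[y, x] = best
    return out
```

For any non-empty 3D array `stack`, the result should match `stack.max(axis=0)`; the first call includes the just-in-time compilation time when Numba is available.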

Supplementary Video 21

Omega knows how to use CuPy. This video presents Omega’s proficiency in utilizing the GPU-accelerated CuPy library. Initially, we requested Omega to confirm the installation and functionality of CuPy. Subsequently, we instruct Omega to perform a z-projection of all images displayed in napari. The video has been sped up by a factor of 2.

Supplementary Video 22

Omega can dialog in many different languages. In this video, we speak with Omega in French. This is possible because most LLMs (ChatGPT, Claude, and others) are naturally multilingual. Omega replies to the user in French, but the tools used still operate internally in English, as most of the prompt templates are written in that language. We have tested Omega in several languages, including Spanish, Italian, German, and even Chinese. This feature enhances accessibility to non-English speakers. The video has been sped up by a factor of 2.

Supplementary Video 23

Omega can ‘see’. In this video, we test Omega’s ability to see by giving it a visual puzzle to solve. We load 4 images (black & white horse, cup of coffee, cat face, camera test image), change their layer names to ‘A’, ‘B’, ‘C’, ‘D’, and then ask Omega to find which of the four images depicts a cup of coffee. Omega uses its ‘napari viewer vision tool’ to see the contents of the napari viewer, correctly describe each image, and identify the one depicting a cup of coffee as the one in the top right corner. We further ask for the name of the layer, and Omega again uses its vision tool to verify that layer ‘B’ holds that image. The final answer is the correct one: ‘B’. The video has been sped up by a factor of 2.

Supplementary Video 24

Omega decides how to best segment an image using vision. In this video, we present Omega with two 2D microscopy images: one with cytoplasm labeled (b) and another with nuclei labeled (a). We then ask Omega to decide how to best segment the biological structures present in each image by using the vision tool. Omega looks at the image contents, describes them, and correctly decides that Cellpose is best for layer b and StarDist is best for layer a.

Supplementary Table 1

Example widgets. Here is a list of example widgets that can be reliably generated using Omega. These prompts can be modified, adjusted, and extended in many ways. If Omega can’t make a functional plugin the first time, or if the result is not exactly what is asked for, being more explicit and asking Omega to ‘try again’ often works. Ideally, these widgets are made once and then can be reused by running the code in Omega’s code editor (see Supp. Fig. 2).

Supplementary Table 2

Reproducibility Analysis. We conducted a reproducibility analysis in which two complex multi-step prompts and three widget generation prompts were run ten times to assess reproducibility – a total of 50 runs. The results suggest that widget generation is more robust than complex multi-step tasks. We observed a 90% reproducibility rate (ratio of successful attempts versus all ten attempts). Each run was run independently with a blank conversation history. In general, our observation is that if Omega fails to follow instructions, it often suffices to ask Omega to “try again”.

Supplementary Table 3

Prompt Table. This table lists all prompts used for all supplemental videos.

Cite this article

Royer, L.A. Omega — harnessing the power of large language models for bioimage analysis. Nat Methods (2024). https://doi.org/10.1038/s41592-024-02310-w

  • DOI: https://doi.org/10.1038/s41592-024-02310-w

  • Springer Nature America, Inc.
