General and Port Requirements

CTERA Content Services includes the following functionality:

  • CTERA Search
  • CTERA Classify
  • CTERA Expert

The CTERA Content Services configuration includes the following components:

  • An Ingestor
    Takes information, breaks it down, and prepares it for the LLM to process.
  • A Collector
    Gathers and organizes the data required to train or operate AI models (like LLMs), which require high-quality datasets to learn.
  • An Embedder
    A specialized machine learning algorithm that translates human-readable data, such as text, images, or audio, into machine-readable vectors.

These components can all be deployed on a single VM. Alternatively, the Ingestor can be installed on one VM and the Collector and Embedder on a second VM.

Note

These configurations can be extended to include more than one collector.

For optimum performance, the collector and embedder VM should be deployed close to the storage bucket where the raw data is collected, to reduce latency. The Ingestor component should be deployed close to the CTERA Portal.

Separating the ingestor component from the collector and embedder components splits the deployment into a control plane, consisting of the ingestor component, and a data plane, consisting of the collector and embedder components.

Installing all the components on a single VM can be used for testing and proof of concept deployments, but for a production environment the deployment of the data and control planes should be on different VMs.
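
The split into control and data planes can be expressed as a simple check on a planned layout. The following is a minimal sketch, not part of the product: the function name, the layout format, and the VM names are hypothetical and are used only for illustration.

```python
# Plane membership as described in the text above.
CONTROL_PLANE = {"ingestor"}
DATA_PLANE = {"collector", "embedder"}


def planes_are_separated(layout: dict[str, set[str]]) -> bool:
    """Return True if no VM hosts both control-plane and data-plane components.

    `layout` maps a VM name (hypothetical) to the set of components on it.
    """
    for components in layout.values():
        if components & CONTROL_PLANE and components & DATA_PLANE:
            return False
    return True


# Single-VM layout: suitable for testing and proof of concept deployments.
single_vm = {"vm-1": {"ingestor", "collector", "embedder"}}

# Two-VM layout: control plane and data plane on different VMs.
two_vm = {"vm-control": {"ingestor"}, "vm-data": {"collector", "embedder"}}

print(planes_are_separated(single_vm))  # False
print(planes_are_separated(two_vm))     # True
```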

Software Requirements

The following software requirements are for the configuration when all three components are installed on the same VM. When an external edge collector is used, CTERA recommends that it include a minimum of 2 NVIDIA L40S GPUs. The resources for an external edge collector VM depend on the LLMs that are used by CTERA Content Services. Contact CTERA Support for help with sizing for this VM.

  • A virtual machine running Ubuntu 64-bit 24.04.3 LTS (Noble Numbat) or later. By default, this VM hosts both CTERA Content Services and an edge collector.

                       Minimum    Recommended
    CPU                8          –
    RAM                32GB       64GB
    Disk               500GB      1TB
    NVIDIA L40S GPU    2          –

    The VM must have a fixed IP address and DNS configured for remote access. For example:

    A            <base_URL>.com                 <base_URL_IP>
    A            *.<base_URL>.com               <base_URL_IP>
    

    Where <base_URL> is the DNS suffix and <base_URL_IP> is the IP address of the Content Services VM.

  • When one VM hosts CTERA Content Services and another VM hosts an edge collector:

    • CTERA Content Services VM
              Minimum    Recommended
      CPU     8          –
      RAM     32GB       64GB
      Disk    500GB      1TB
    • Collector VM
      GPU with 48GB VRAM and 1TB Disk
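
The DNS requirement for the Content Services VM (an A record for the base name and a wildcard A record, both pointing at the VM) can be verified before installation. The following is a minimal sketch that assumes Python is available on a machine using the same DNS servers. The function name `check_dns`, the `probe_label` parameter, and the example domain and IP address are hypothetical. Because of the wildcard record, any label under the base name should resolve to the same IP address.

```python
import socket
from typing import Callable


def check_dns(base_domain: str, expected_ip: str,
              resolve: Callable[[str], str] = socket.gethostbyname,
              probe_label: str = "dns-check") -> dict[str, bool]:
    """Check that the base name and one name under the wildcard resolve to expected_ip.

    `probe_label` is an arbitrary label used only to exercise the wildcard record.
    """
    names = [base_domain, f"{probe_label}.{base_domain}"]
    results = {}
    for name in names:
        try:
            results[name] = resolve(name) == expected_ip
        except OSError:
            # Resolution failures (socket.gaierror is an OSError) count as a mismatch.
            results[name] = False
    return results


# Example call with placeholder values; replace with your own name and IP address:
# check_dns("content.example.com", "192.0.2.10")
```

The resolver is passed in as a parameter so that the check can be exercised without network access, for example with a dictionary lookup in place of `socket.gethostbyname`.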

General Requirements

  • A browser with access to the Internet. You can use any of the latest two releases of Google Chrome, Apple Safari, Microsoft Edge, and Mozilla Firefox.
  • Microsoft Entra ID P1. For details, see Setting Up SSO in Azure Entra ID.
    Groups of end users require the role ctera-ai-user assigned to the OPENID_USER_ROLE_ID role ID in Entra ID.
    Groups of administrators require the role ctera-ai-admin assigned to the OPENID_ADMIN_ROLE_ID role ID in Entra ID.
  • The latest Content Services OVA file, provided by CTERA Support.
    Note

    On platforms other than ESXi, a token is required from CTERA Support to access the CTERA docker repository in Azure Container Registry (ACR) to retrieve the latest CTERA Content Services Docker images.

Inbound Ports

Port     Protocol   Notes
22       TCP        SSH. CTERA recommends limiting SSH access to specific IP addresses that may require access to the CTERA application servers, for example to perform scheduled maintenance and support-related work.
80       TCP        HTTP
443      TCP        HTTPS (chat, administration, and SSO callbacks)
11434    TCP        Optionally, to run the LLM using Ollama
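
The inbound rules above, including the recommendation to restrict SSH to specific IP addresses, can be turned into host firewall commands. The following is a minimal sketch that assumes `ufw` is used as the firewall on the Ubuntu VM, which this document does not state. The function name and the source address are hypothetical. The sketch only builds the command strings and does not run them.

```python
# Inbound ports from the table above. Port 80 carries HTTP, which runs over TCP.
INBOUND_PORTS = [
    (22, "tcp", "SSH"),
    (80, "tcp", "HTTP"),
    (443, "tcp", "HTTPS"),
    (11434, "tcp", "Ollama (optional)"),
]


def ufw_commands(ssh_sources: list[str], include_ollama: bool = False) -> list[str]:
    """Build ufw commands for the inbound ports, restricting SSH to ssh_sources."""
    commands = []
    for port, proto, _description in INBOUND_PORTS:
        if port == 11434 and not include_ollama:
            continue  # Only needed when the LLM is run using Ollama.
        if port == 22:
            # One rule per allowed source address instead of opening SSH to everyone.
            for source in ssh_sources:
                commands.append(f"ufw allow from {source} to any port 22 proto {proto}")
        else:
            commands.append(f"ufw allow {port}/{proto}")
    return commands


for command in ufw_commands(["203.0.113.5"]):
    print(command)
```

Review the generated commands before applying them, and adapt them if a different firewall or a cloud security group is used instead.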

Outbound Ports

Port     Protocol   Notes
443      TCP        HTTPS for access to the docker registries and Azure Entra ID SSO

Internal Ports (not requiring Internet access)

Port     Protocol   Notes
6000     TCP        Required only on edge collector server for data collection
7997     TCP        For the Infinity embedding inference server
7998     TCP        For the Python library
8080     TCP        Access to Dozzle, the web-based interface to monitor docker logs, for troubleshooting

Other ports can be optionally opened for maintenance. For the full list, contact CTERA Support.
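
To confirm that the internal ports listed above are reachable between VMs, a simple TCP connection test can be used. The following is a minimal sketch using only the Python standard library. The function names and the example host address are hypothetical, and a successful connection shows only that something is listening on the port, not that it is the expected service.

```python
import socket

# Internal ports from the table above.
INTERNAL_PORTS = {
    6000: "edge collector data collection",
    7997: "Infinity embedding inference server",
    7998: "Python library",
    8080: "Dozzle",
}


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def probe(host: str, ports=INTERNAL_PORTS) -> dict[int, bool]:
    """Test every port in `ports` on `host` and report which accept connections."""
    return {port: port_is_open(host, port) for port in ports}


# Example call with a placeholder address; replace with the VM to test:
# print(probe("192.0.2.20"))
```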