IBM Unveils Granite Code
In a significant move for the programming community, IBM has unveiled "Granite Code," a suite of eight open-source code LLMs: base and instruct variants at four sizes, ranging from 3 billion to 34 billion parameters. Built on the Llama architecture, the models target code generation and understanding tasks for developers.

Drawing on data from BigCode's The Stack and public GitHub repositories, the Granite models were trained on 116 programming languages. Training proceeded in two phases: an initial phase on code-only data, followed by a second phase on over 500 billion tokens of high-quality code paired with natural-language context. IBM trained the models on its Vela and Blue Vela supercomputers.

The largest model weighs in at 34 billion parameters, but notably, even the 8-billion-parameter Granite 8B has demonstrated superior performance to other open LLMs such as CodeGemma and Mistral across various benchmarks. Even more striking is the models' purported support for COBOL, a language often deemed archaic but still widely used in legacy systems.

The models are released under the permissive Apache 2.0 license, ensuring broad accessibility, and are readily available on platforms like Hugging Face for immediate integration into projects.

However, it's worth noting that while the model weights have been made openly available, the datasets used for training have not been released, and the accompanying paper makes no mention of decontamination efforts. Nonetheless, the release of Granite Code marks a significant milestone for open AI, and it stands to change how developers approach code generation and understanding.
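For developers who want to try the models right away, a minimal sketch of loading an instruct variant through the Hugging Face transformers library might look like the following. The repository name ibm-granite/granite-3b-code-instruct and the generation settings are assumptions for illustration; check IBM's Hugging Face organization page for the exact model IDs.

```python
# Minimal sketch: generating code with a Granite Code instruct model
# via Hugging Face transformers. The model ID below is an assumption
# based on IBM's Hugging Face naming; verify it before running.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ibm-granite/granite-3b-code-instruct"  # assumed repo name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",   # place weights on GPU if one is available
    torch_dtype="auto",  # use the checkpoint's native precision
)

# Ask the instruct variant to write a small function.
prompt = "Write a Python function that reverses a string."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because the checkpoints use the standard transformers interfaces, the same few lines should work across all four model sizes by swapping the model ID.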