To meet these challenges, the team designed an AI-powered data and content pipeline that merged domain-specific LLM usage with live search integration and sector-based logic.
1. Multi-Level Sector Modeling and Parameterization
Using Python, PySpark, and AWS Glue, the team built a processing layer that organized companies by their GICS sectors. Each sector carried its own set of evaluation metrics – typically 7–8 key economic indicators relevant to macro analysis. For example:
- In Banking: inflation rates, central bank policy, financial system stability.
- In Energy: commodity prices, regulatory shifts, supply-demand forecasts.
This logic ensured that every company could be analyzed in context – not only by its financials but also by the sector-level trends shaping its environment.
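The sector-to-metrics mapping described above can be sketched as a simple lookup table. The sector keys follow the document's examples; the metric identifiers are illustrative assumptions, not the team's actual configuration.

```python
# Illustrative sector -> evaluation-metrics table (metric names are assumptions).
SECTOR_METRICS = {
    "Banking": [
        "inflation_rate",
        "central_bank_policy",
        "financial_system_stability",
    ],
    "Energy": [
        "commodity_prices",
        "regulatory_shifts",
        "supply_demand_forecast",
    ],
}

def metrics_for(sector: str) -> list[str]:
    """Return the evaluation metrics configured for a sector (empty if unmapped)."""
    return SECTOR_METRICS.get(sector, [])
```

In the real pipeline each sector carried 7–8 such indicators; a dictionary like this keeps the per-sector logic declarative and easy to extend.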
2. Region-Specific Macro Analysis Framework
The pipeline then extended this logic to regional macroeconomics. For each sector, macro narratives were generated across multiple layers:
- High-level regions: North America, MENA, Asia, Europe.
- Country-specific insights within each region: U.S., Japan, Israel, Germany, Thailand, Brazil.
This hierarchy allowed the team to match every company with the most appropriate macroeconomic context. For example, a U.S.-based bank would receive analysis based on the latest American inflation figures and Federal Reserve policies, while a Japanese manufacturer would be aligned with Asia-Pacific trends.
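The two-layer hierarchy can be sketched as a most-specific-first lookup: try the company's country, then fall back to its high-level region. The ISO-style country codes and the partial mapping below are illustrative assumptions.

```python
# Illustrative slice of the region hierarchy (codes and mapping are assumptions).
COUNTRY_TO_REGION = {
    "US": "North America",
    "JP": "Asia",
    "TH": "Asia",
    "IL": "MENA",
    "DE": "Europe",
}

def context_layers(country: str) -> list[str]:
    """Most-specific-first list of macro-context layers to try for a company."""
    layers = [country]
    region = COUNTRY_TO_REGION.get(country)
    if region:
        layers.append(region)
    return layers
```

A U.S. bank would thus resolve to `["US", "North America"]` and receive the American inflation/Fed narrative first, falling back to the regional one only if no country-level text exists.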
3. LLM Text Generation with Live Internet Context
To overcome the limitations of static LLM knowledge, the team integrated the Google Search tool directly into the Gemini pipeline. When prompted to produce macroeconomic text, the LLM first triggered a series of targeted online searches – for instance:
- “Interest rate trends 2025 site:bloomberg.com”
- “Inflation forecast Q2 2025 central banks”
The search results were injected as real-time context for the generation step. Instead of relying on potentially outdated training data, Gemini wrote from information gathered moments earlier.
To improve relevance and quality, the system was instructed to prioritize high-authority, widely cited sources such as Bloomberg, CNN, and CNBC. This flexible prioritization proved more robust than a hardcoded source list.
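The query construction and authority-first filtering could look roughly like the sketch below. The query templates mirror the examples above, but the domain set and ranking rule are assumptions about how such a preference might be encoded, not the team's actual implementation.

```python
from urllib.parse import urlparse

# Illustrative set of high-authority domains (an assumption, not the real list).
HIGH_AUTHORITY = {"bloomberg.com", "cnn.com", "cnbc.com"}

def build_queries(topic: str, year: int) -> list[str]:
    """Targeted search queries in the style the pipeline issued before generation."""
    return [
        f"{topic} trends {year} site:bloomberg.com",
        f"{topic} forecast Q2 {year} central banks",
    ]

def rank_by_authority(hits: list[dict]) -> list[dict]:
    """Stable sort: results from high-authority domains come first."""
    def key(hit: dict) -> int:
        host = urlparse(hit["url"]).netloc.removeprefix("www.")
        return 0 if host in HIGH_AUTHORITY else 1
    return sorted(hits, key=key)
```

Ranking rather than hard-filtering matches the document's point: lower-authority results are deprioritized, not discarded, so coverage is preserved when top outlets have no relevant article.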
4. Experimentation, Monitoring, and Tooling
All experiments and outputs were tracked using LangChain, Langfuse, and Deepnote. These tools let the technical team monitor prompt behavior, track changes in generation logic, and iterate rapidly on performance.
5. Final Company-Macro Matching Engine
At the final stage, each company was matched with the best available macroeconomic analysis. The matching logic worked in layers – starting from country-level (if available) and falling back to regional insights when country-specific data was missing.
The resulting profiles combined:
- Internal scoring (summary recommendation: buy/hold/sell)
- Company-specific metrics (profitability, investment activity)
- Region/sector-aware macroeconomic context
This blend ensured that the final output was both numerically grounded and economically current.