
From SDK to SSR: Performance Optimization Lessons Across Frameworks

12 min read · Performance

Prologue: Building Performance Culture at Rokt

Before diving into my recent work optimizing React/Remix applications at Lorikeet, I want to share the foundation that made this possible: my experience building WSDK2 at Rokt with an incredible team. Together, we achieved a 30% reduction in SDK load time and a 40% decrease in script size, significantly improving performance and user experience across thousands of client integrations.

What made this work successful was not just the optimizations themselves, but the systematic methodology we developed: instrument, measure, identify, optimize. We started with tracer bullet development, building a minimal viable version of WSDK2 to validate our approach before committing to full implementation. This framework proved to be framework-agnostic and powerful enough to apply to entirely different contexts, from third-party SDKs running in iframes to server-side rendered React applications.

Why Performance Measurement Matters

You cannot improve what you do not measure. This principle guided both my work at Rokt and now at Lorikeet. Performance optimization without data is just guesswork: you need concrete numbers to identify bottlenecks, validate improvements, and communicate impact to stakeholders.

Throughout my career, I have employed various performance measurement techniques: analyzing HAR (HTTP Archive) files to understand network waterfalls, running Lighthouse audits for comprehensive performance scores, and deep-diving into Chrome DevTools Performance tab for CPU profiling and rendering analysis. While these tools are invaluable, this blog focuses specifically on the instrumentation-based approach that proved most impactful for the React/Remix optimization work at Lorikeet. The methodology described here is particularly effective for identifying and resolving server-side rendering bottlenecks in production environments.

Performance Markers: The Foundation of Optimization

At Rokt, we instrumented WSDK2 with Date.now() markers throughout the initialization flow. Why Date.now() instead of performance.now()? Because we were dealing with cross-origin iframes, measuring performance from both first-party (client website) and third-party (Rokt SDK) perspectives. Date.now() provides consistent timestamps across iframe boundaries, whereas performance.now() is relative to each browsing context's time origin.

For the Remix/React work at Lorikeet, we switched to performance.now() in server-side loaders because we are measuring within a single Node.js process context. performance.now() offers higher precision (microsecond resolution) and monotonic timing that is immune to system clock adjustments.

Key Insight: Choose Your Timer Wisely

  • Use Date.now() when measuring across different contexts (iframes, workers, multiple browser tabs)
  • Use performance.now() for high-precision measurements within a single JavaScript context
  • Use performance.mark() for integration with browser DevTools and the Performance API
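As a minimal sketch of the third option, both browsers and Node expose the User Timing API; the marker names here ("loader-start", "loader-end") are hypothetical:

```typescript
import { performance } from "node:perf_hooks";

// Mark the boundaries of a phase, then derive a named measure from them.
// In a browser, these marks also show up in the DevTools Performance panel.
performance.mark("loader-start");
// ... the work being measured would happen here ...
performance.mark("loader-end");

const entry = performance.measure("loader", "loader-start", "loader-end");
console.log(`loader phase: ${entry.duration.toFixed(2)}ms`);
```

The measure's duration is computed from the two marks, so instrumentation stays declarative rather than sprinkling subtraction arithmetic through the code.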

Collecting Production Timing Data

After instrumenting our code with performance markers, we deployed to production and collected real-world timing data. This step is crucial: synthetic tests and local development environments do not capture the variability of real user conditions: network latency, device capabilities, concurrent resource loading, and cache states.

Here's what typical timing data looked like for a slow Remix route:

timing: {
  authDurationMs: "100-500ms",
  configQueryMs: "50-200ms",
  mainContentQueryMs: "1000-2500ms",  // ⚠️ Bottleneck!
  auxiliaryQueryMs: "20-200ms",
  totalLoaderMs: "1200-3000ms"
}

The data revealed the primary bottleneck: a query dominating 70-80% of total load time. Identifying which operations consume the most time is exactly the kind of insight you need to prioritize optimization efforts effectively.
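A self-contained sketch of how a loader can be instrumented to emit timing data like the object above. The fetchers and their delays are stand-ins, not the real Lorikeet code:

```typescript
import { performance } from "node:perf_hooks";

// Hypothetical stand-ins for the real data fetchers.
const sleep = (ms: number) => new Promise((r) => setTimeout(r, ms));
const fetchConfig = async () => { await sleep(20); return { pageSize: 20 }; };
const fetchMainContent = async () => { await sleep(60); return [{ id: 1 }]; };

// Wrap each await in performance.now() deltas and log the resulting object,
// which can then be shipped with structured logs for production analysis.
async function instrumentedLoader() {
  const start = performance.now();

  const tConfig = performance.now();
  const config = await fetchConfig();
  const configQueryMs = performance.now() - tConfig;

  const tMain = performance.now();
  const content = await fetchMainContent();
  const mainContentQueryMs = performance.now() - tMain;

  const timing = {
    configQueryMs,
    mainContentQueryMs,
    totalLoaderMs: performance.now() - start,
  };
  console.log("timing:", timing);
  return { config, content, timing };
}
```

Aggregating these logged objects across real traffic is what surfaces the percentile ranges shown above.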

From Rokt to Lorikeet: Applying Lessons to React/Remix

At Lorikeet, we had a critically slow page in our web application. Average page load time was hovering around 2.2 seconds, far too slow for a good user experience. Applying the systematic methodology we developed while building WSDK2 at Rokt, we achieved remarkable results:

Performance Improvement: 2.2s → ~700ms

  • 68% lower latency
  • 🚀 Roughly 3x faster perceived load time
  • 📈 Achieved through systematic optimization, not magic

React/Remix Performance Optimization Patterns

The performance instrumentation quickly revealed our bottlenecks. Armed with real data showing exactly where time was being spent, we applied three key optimization patterns. Each pattern addresses a specific type of performance issue we discovered through our timing analysis.

These aren't theoretical optimizations; they're battle-tested patterns recommended by the Remix team and taught in depth by Kent C. Dodds in his Advanced Remix Frontend Masters course. These patterns took our slowest pages from unusable to fast. Here's what worked:

Pattern 1: Parallel Query Execution

The first issue we found? Queries that didn't depend on each other were running one after another instead of simultaneously. This is one of the most common performance anti-patterns in async JavaScript. When queries are independent, they should run in parallel, period.

Here's what we were doing wrong and how we fixed it:

Before (sequential - slow):

// ❌ configData blocks independentQuery unnecessarily
const configData = await fetchConfig(auth)

// This starts AFTER configData completes (bad!)
const independentQueryPromise = fetchIndependentData(auth)

const dependentQueryPromise = fetchDependentData(configData)

await Promise.all([independentQueryPromise, dependentQueryPromise])

After (parallel - fast):

// ✅ Start independent queries immediately
const independentQueryPromise = fetchIndependentData(auth)

// Only await config when actually needed
const configData = await fetchConfig(auth)

const dependentQueryPromise = fetchDependentData(configData)

await Promise.all([independentQueryPromise, dependentQueryPromise])

Key insight: Start independent queries before awaiting dependencies they don't need. This simple reordering can shave hundreds of milliseconds off your critical path.
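A minimal, self-contained way to see the reordering pay off. The fetchers and delays are invented for illustration; note the win only appears when the independent query is longer than the chain it was blocked behind:

```typescript
const sleep = (ms: number) => new Promise((r) => setTimeout(r, ms));

// Hypothetical fetchers: the independent query is the long one.
const fetchConfig = async () => { await sleep(40); return "config"; };
const fetchIndependentData = async () => { await sleep(100); return "independent"; };
const fetchDependentData = async (cfg: string) => { await sleep(40); return `${cfg}-dep`; };

async function blockedStart() {
  const t = Date.now();
  const config = await fetchConfig(); // independent query waits here
  await Promise.all([fetchIndependentData(), fetchDependentData(config)]);
  return Date.now() - t; // roughly 40 + 100 = 140ms
}

async function eagerStart() {
  const t = Date.now();
  const independentPromise = fetchIndependentData(); // fires immediately
  const config = await fetchConfig();
  await Promise.all([independentPromise, fetchDependentData(config)]);
  return Date.now() - t; // roughly max(100, 40 + 40) = 100ms
}
```

Running both shows the eager version finishing about 40ms sooner, entirely from reordering, with zero changes to the queries themselves.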

Pattern 2: Deferred Loading with Progressive Hydration

Remix's defer() utility enables one of the most powerful performance patterns for SSR applications: streaming non-critical data after navigation. This dramatically improves perceived performance: users see content instantly while data continues loading in the background.

export const loader = async ({ request }: LoaderFunctionArgs) => {
  const auth = await enforceProtectedRoute({ request })

  // Critical data: needed for page structure
  const filters = await fetchFilters(auth)
  const pagination = { page: 1, pageSize: 20 }

  // Non-critical data: defer these!
  const dropdownOptionsPromise = fetchDropdownOptions(auth)
  const sidebarDataPromise = fetchSidebarData(auth)

  return defer({
    // Synchronous: page renders immediately
    filters,
    pagination,

    // Deferred: streams in after navigation
    deferredDropdownOptions: dropdownOptionsPromise,
    deferredSidebarData: sidebarDataPromise,
  })
}

On the component side, use Suspense with Await to progressively hydrate deferred data:

<Suspense fallback={<SkeletonLoader />}>
  <Await
    resolve={deferredDropdownOptions}
    errorElement={<ErrorFallback />}
  >
    {(options) => <FilterDropdown options={options} />}
  </Await>
</Suspense>

Pattern 3: Defer Main Content with Skeleton UI

When a single query dominates your total load time (e.g., taking 80-90% of the total duration), deferring secondary UI elements is not enough. In this case, defer the main content itself and show a skeleton table immediately.

This was the game-changing optimization for slow pages. By deferring the expensive query and rendering a skeleton UI instantly, we transformed the user experience from "waiting several seconds staring at a loading spinner" to "seeing the page structure immediately with content populating progressively."

return defer({
  // Synchronous: page shell renders instantly
  filters: currentFilters,
  pagination: { page, pageSize },

  // DEFERRED: Main content (slow query)
  deferredMainContent: fetchMainContent(currentFilters),
})

Skeleton UI implementation:

<Suspense
  fallback={
    <div className="flex flex-col gap-4 pt-4">
      {[...Array(10)].map((_, i) => (
        <div className="flex items-center gap-4 py-2" key={i}>
          <div className="h-4 w-16 animate-pulse rounded bg-gray-200" />
          <div className="h-4 w-48 animate-pulse rounded bg-gray-200" />
          <div className="h-4 w-32 animate-pulse rounded bg-gray-200" />
        </div>
      ))}
    </div>
  }
>
  <Await resolve={deferredMainContent}>
    {(data) => <ContentTable data={data} />}
  </Await>
</Suspense>

Common JavaScript Performance Pitfalls

Sequential Loops vs Promise.all

One of the most impactful optimizations we made involved converting sequential loops to parallel execution. This pattern appears frequently in backend services and loaders.

Anti-pattern (sequential - slow):

async function fetchUserData(userIds: string[]) {
  const results = []

  // Each iteration waits for the previous one
  for (const userId of userIds) {
    const userData = await fetchUser(userId)
    results.push(userData)
  }

  return results
}

// If you have 5 users, each taking 100ms → 500ms total

Optimized (parallel - fast):

async function fetchUserData(userIds: string[]) {
  // All requests fire simultaneously
  const results = await Promise.all(
    userIds.map(userId => fetchUser(userId))
  )

  return results
}

// All 5 users fetch in parallel → 100ms total (limited by slowest)

Visual Comparison:

Sequential (500ms total):

User 1 ━━━━━━━━━━ 100ms
       User 2 ━━━━━━━━━━ 100ms
              User 3 ━━━━━━━━━━ 100ms
                     User 4 ━━━━━━━━━━ 100ms
                            User 5 ━━━━━━━━━━ 100ms

Parallel (100ms total):

User 1 ━━━━━━━━━━┐
User 2 ━━━━━━━━━━┤
User 3 ━━━━━━━━━━┼━━ 100ms (limited by slowest)
User 4 ━━━━━━━━━━┤
User 5 ━━━━━━━━━━┘
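The timelines above can be reproduced with a small self-contained demo; fetchUser is a stub with an artificial 50ms delay:

```typescript
const sleep = (ms: number) => new Promise((r) => setTimeout(r, ms));
const fetchUser = async (id: string) => { await sleep(50); return { id }; };
const userIds = ["u1", "u2", "u3", "u4", "u5"];

// Each iteration waits for the previous one: elapsed is roughly 5 × 50ms.
async function sequentialFetch() {
  const t = Date.now();
  const results = [];
  for (const id of userIds) results.push(await fetchUser(id));
  return { results, elapsed: Date.now() - t };
}

// All requests fire at once: elapsed is roughly the slowest single request.
async function parallelFetch() {
  const t = Date.now();
  const results = await Promise.all(userIds.map((id) => fetchUser(id)));
  return { results, elapsed: Date.now() - t };
}
```

Both return the same five results; only the elapsed time differs, which is exactly the property that makes this refactor safe.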

Sequential Promise.all Chains

Another common anti-pattern is chaining multiple Promise.all calls when the second batch doesn't actually depend on the first batch's results.

Before (sequential batches - slow):

async function fetchAggregatedData(userId: string) {
  // First batch
  const [dataA, dataB] = await Promise.all([
    fetchDataA(userId),
    fetchDataB(userId),
  ])

  // Second batch waits for first (unnecessarily!)
  const [dataC, dataD, dataE] = await Promise.all([
    fetchDataC(userId),
    fetchDataD(userId),
    fetchDataE(userId),
  ])

  return { dataA, dataB, dataC, dataD, dataE }
}

After (single parallel batch - fast):

async function fetchAggregatedData(userId: string) {
  // ALL queries run in parallel
  const [dataA, dataB, dataC, dataD, dataE] = await Promise.all([
    fetchDataA(userId),
    fetchDataB(userId),
    fetchDataC(userId),
    fetchDataD(userId),
    fetchDataE(userId),
  ])

  return { dataA, dataB, dataC, dataD, dataE }
}
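When one call in the batch genuinely does need another's result, you can chain just that pair without serializing the whole batch. The fetchers here are trivial stubs standing in for the ones above:

```typescript
// Stub fetchers standing in for the real ones.
const fetchDataA = async (userId: string) => `${userId}:A`;
const fetchDataB = async (userId: string) => `${userId}:B`;
const fetchDataC = async (dataA: string) => `${dataA}->C`; // needs dataA

async function fetchWithPartialDependency(userId: string) {
  const dataAPromise = fetchDataA(userId); // started once, shared twice
  const [dataA, dataB, dataC] = await Promise.all([
    dataAPromise,
    fetchDataB(userId),
    dataAPromise.then((a) => fetchDataC(a)), // chained off A, parallel with B
  ]);
  return { dataA, dataB, dataC };
}
```

Only the truly dependent edge (A → C) is sequential; B still runs alongside both.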

Accidental Sequential forEach

Array methods like forEach, map, and filter do not await the promises their callbacks return. This can lead to subtle bugs where surrounding code runs before the async work has finished.

Anti-pattern (does not wait):

// This does not wait for async operations!
const results = []
items.forEach(async (item) => {
  const result = await processItem(item)
  results.push(result)
})
// results is still empty here!

Correct (parallel execution):

// Use Promise.all with map
const results = await Promise.all(
  items.map(item => processItem(item))
)
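Conversely, when calls genuinely must run one at a time (say, a strict upstream rate limit), make the sequencing explicit with an async for...of loop rather than an accidental forEach. processInOrder is a hypothetical helper sketching that intent:

```typescript
// Runs fn over items strictly one at a time, preserving input order.
async function processInOrder<T, R>(
  items: T[],
  fn: (item: T) => Promise<R>
): Promise<R[]> {
  const results: R[] = [];
  for (const item of items) {
    results.push(await fn(item)); // each call waits for the previous one
  }
  return results;
}
```

Naming the helper makes the sequential behavior a deliberate choice a reviewer can see, rather than an accident of loop syntax.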

Performance Metrics: Perceived vs Actual

An important lesson from both my Rokt and Lorikeet work is understanding the difference between actual performance (how long operations take) and perceived performance (how fast the application feels to users).

"Defer does not make your queries faster; it makes your application feel faster by rendering content progressively while data loads in the background."

This distinction is crucial. Deferring a slow query does not make it faster; the query still takes the same time to execute. What changes is First Contentful Paint (FCP): users see meaningful content immediately instead of waiting for everything to load before seeing anything.

Best Practice: Track Both Metrics

  • FCP (First Contentful Paint): Perceived load time - when users see content
  • Query Duration: Actual backend performance - how long operations take
  • LCP (Largest Contentful Paint): When main content becomes visible
  • TTI (Time to Interactive): When the page becomes fully interactive

Defer is a powerful UX tool for resilience, not a substitute for backend optimization. The ideal scenario is fast backend queries + progressive rendering = exceptional user experience. Defer alone provides acceptable UX while masking technical debt.

Decision Framework: What to Defer vs Await

Not all data should be deferred. Use this framework to decide what to await synchronously vs defer for streaming:

Always Await (Critical Data)

  • Authentication and authorization checks
  • Data that determines page structure/layout
  • Data that other queries depend on
  • Basic page shell elements (filters, navigation, breadcrumbs)

Consider Deferring (Non-Critical Data)

  • Filter dropdown options
  • Secondary panels and sidebars
  • Enhancement data (tooltips, metadata, analytics)
  • Data only needed after user interaction

Defer Main Content (When Justified)

When a query dominates total load time (90%+ of total duration), consider deferring even main content:

  • Large data tables or lists: show skeleton tables
  • Dashboard charts: show loading chart placeholders
  • Search results: show skeleton cards
  • Feed-style content: show skeleton posts

Questions to Ask Before Deferring

  1. Can users see meaningful content without this data?
  2. Does this data populate a secondary UI element?
  3. Is there a reasonable loading/skeleton state?
  4. Do other queries depend on this result?
  5. Is this query the primary bottleneck (90%+ of load time)?

Important Limitations: Remix Defer Bug

Remix's defer() has a known issue (issue #6637) where it does not work correctly on same-route navigation with changed URL parameters:

Scenario                              | Defer Works?
--------------------------------------|---------------------------
Initial page load                     | ✅ Yes
Navigate to different route           | ✅ Yes
Change filters/date on same route     | ❌ No (waits for all data)

Impact: The defer pattern still provides significant benefits for initial loads and cross-route navigation. When it "fails" on same-route navigation, it degrades gracefully to normal await behavior, which is no worse than before the optimization.

Mitigation: The main performance wins come from parallel query restructuring, which works regardless of this bug.

Key Takeaways

  1. Measure first, optimize second. Use performance markers (Date.now() for cross-context, performance.now() for high-precision) to collect production timing data before making changes.
  2. Parallelize independent queries. Start queries that don't depend on each other simultaneously, don't let unnecessary await calls block execution.
  3. Defer non-critical data. Use Remix's defer() to stream secondary UI data after initial page render, dramatically improving perceived performance.
  4. Consider deferring main content. When a single query dominates load time (90%+ of total), defer it with skeleton UI for instant page rendering.
  5. Convert sequential loops to Promise.all. Replace for...of loops with await inside them with Promise.all(items.map(...)) for parallel execution.
  6. Track perceived vs actual performance. Measure both FCP (when users see content) and query duration (actual backend performance) separately.
  7. Defer is resilience, not a fix. Use defer to improve UX immediately, but always investigate and optimize slow queries at the source.

Conclusion

Performance optimization is a systematic discipline, not magic. The methodology I learned while building WSDK2 at Rokt became the blueprint I used for tackling performance issues at Lorikeet. Instrument, measure, identify, optimize. This framework proved powerful enough to apply across entirely different technology stacks. Whether you are optimizing a third-party SDK loading in iframes or a server-side rendered React application, the principles remain the same.

At Lorikeet, we reduced page load time from 2.2 seconds to ~700ms (68% improvement, 3x faster) by applying these patterns systematically: parallel query execution, deferred loading with progressive hydration, and skeleton UI for slow queries. The result is a dramatically better user experience with instant navigation and progressive content loading.

Performance is not just about speed, it is about creating delightful user experiences that keep visitors engaged. Start measuring today, identify your bottlenecks, and apply these patterns to transform your application's performance.

From Personal Learning to Team Capability

After solving these performance challenges, I documented everything I learned and turned it into a Claude Skill called remix-performance-optimizer. This transformed my personal expertise into institutional knowledge that anyone on my team can leverage automatically.

Now when teammates encounter slow pages, Claude automatically applies these same optimization patterns without needing to remember the methodology or search through documentation. It's like uploading kung fu directly into Claude's brain - individual expertise becomes a team superpower.

Want to learn how to turn your expertise into team capability?

I wrote a detailed guide on using Claude Skills to transform personal knowledge into institutional capability that works automatically for your entire team.

Read: Claude Skills - Turning Personal Expertise into Team Superpowers
